1. 11 3月, 2019 1 次提交
    • S
      perf/core: Restore mmap record type correctly · d9c1bb2f
      Stephane Eranian 提交于
      On mmap(), perf_events generates a RECORD_MMAP record and then checks
      which events are interested in this record. There are currently 2
      versions of mmap records: RECORD_MMAP and RECORD_MMAP2. MMAP2 is larger.
      The event configuration controls which version the user level tool
      accepts.
      
      If the event->attr.mmap2=1 field then MMAP2 record is returned.  The
      perf_event_mmap_output() takes care of this. It checks attr->mmap2 and
      corrects the record fields before putting it in the sampling buffer of
      the event.  At the end the function restores the modified MMAP record
      fields.
      
      The problem is that the function restores the size but not the type.
      Thus, if a subsequent event only accepts MMAP type, then it would
      instead receive an MMAP2 record with a size of MMAP record.
      
      This patch fixes the problem by restoring the record type on exit.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: 13d7a241 ("perf: Add attr->mmap2 attribute to an event")
      Link: http://lkml.kernel.org/r/20190307185233.225521-1-eranian@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d9c1bb2f
  2. 10 3月, 2019 1 次提交
    • I
      Merge tag 'perf-core-for-mingo-5.1-20190307' of... · b339da48
      Ingo Molnar 提交于
      Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core changes from Arnaldo Carvalho de Melo:
      
      perf bpf:
      
        Arnaldo Carvalho de Melo:
      
        - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
          tools such as 'bpftool map dump' can pretty print map keys and values.
      
      perf c2c:
      
        Jiri Olsa:
      
        - Fix report for empty NUMA node.
      
      perf diff:
      
        Jin Yao:
      
        - Support --time, --cpu, --pid and --tid filter options.
      
      perf probe:
      
        Arnaldo Carvalho de Melo:
      
        - Clarify error message about not finding kernel modules debuginfo.
      
      perf record:
      
        Jiri Olsa:
      
        - Fixup probing for max attr.precise_ip.
      
      perf trace:
      
        Arnaldo Carvalho de Melo:
      
        - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
      
      perf annotate:
      
        Arnaldo Carvalho de Melo:
      
        - Calculate the max instruction name, align column to that, removing the
          hardcoded max 6 chars and cope with instructions with names longer than that,
          such as vpmovmskb, vpcmpeqb, etc.
      
      kernel:
      
        Song Liu:
      
        - Consider events with attr.bpf_event set as side-band.
      
        Gustavo A. R. Silva:
      
        - Mark expected switch fall-through in perf_event_parse_addr_filter().
      
      Libraries:
      
        Jiri Olsa:
      
        - Fix leaks and double frees on error paths.
      
      libtraceevent:
      
        Tony Jones:
      
        - Fix buffer overflow in arg_eval().
      
      python scripting:
      
        Tony Jones:
      
        - More python3 fixes.
      
      Trivial:
      
        Yang Wei:
      
        - Remove needless extra semicolon in clang C++ glue code.
      
      Intel PT/BTS:
      
        Adrian Hunter:
      
        - Improve auxtrace address filter error message when there is no DSO.
      
        - Fix divide by zero when TSC is not available.
      
        - Further improvements to the export to sqlite/posgresql python scripts
          and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
          the creation of call trees.
      
        Andi Kleen:
      
        - Generalize function to copy from thread addr space from intel-bts code.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b339da48
  3. 09 3月, 2019 3 次提交
    • G
      perf/core: Mark expected switch fall-through · 43aa378b
      Gustavo A. R. Silva 提交于
      In preparation to enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warning:
      
        kernel/events/core.c: In function ‘perf_event_parse_addr_filter’:
        kernel/events/core.c:9154:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
            kernel = 1;
            ~~~~~~~^~~
        kernel/events/core.c:9156:3: note: here
           case IF_SRC_FILEADDR:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190212205430.GA8446@embeddedorSigned-off-by: NIngo Molnar <mingo@kernel.org>
      43aa378b
    • K
      perf/x86/intel/uncore: Fix client IMC events return huge result · 8041ffd3
      Kan Liang 提交于
      The client IMC bandwidth events currently return very large values:
      
        $ perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 10000 -a
      
        10.000117222 34,788.76 MiB uncore_imc/data_reads/
        10.000117222 8.26 MiB uncore_imc/data_writes/
        20.000374584 34,842.89 MiB uncore_imc/data_reads/
        20.000374584 10.45 MiB uncore_imc/data_writes/
        30.000633299 37,965.29 MiB uncore_imc/data_reads/
        30.000633299 323.62 MiB uncore_imc/data_writes/
        40.000891548 41,012.88 MiB uncore_imc/data_reads/
        40.000891548 6.98 MiB uncore_imc/data_writes/
        50.001142480 1,125,899,906,621,494.75 MiB uncore_imc/data_reads/
        50.001142480 6.97 MiB uncore_imc/data_writes/
      
      The client IMC events are freerunning counters. They still use the
      old event encoding format (0x1 for data_read and 0x2 for data write).
      The counter bit width is calculated by common code, which assume that
      the standard encoding format is used for the freerunning counters.
      Error bit width information is calculated.
      
      The patch intends to convert the old client IMC event encoding to the
      standard encoding format.
      
      Current common code uses event->attr.config which directly copy from
      user space. We should not implicitly modify it for a converted event.
      The event->hw.config is used to replace the event->attr.config in
      common code.
      
      For client IMC events, the event->attr.config is used to calculate a
      converted event with standard encoding format in the custom
      event_init(). The converted event is stored in event->hw.config.
      For other events of freerunning counters, they already use the standard
      encoding format. The same value as event->attr.config is assigned to
      event->hw.config in common event_init().
      Reported-by: NJin Yao <yao.jin@linux.intel.com>
      Tested-by: NJin Yao <yao.jin@linux.intel.com>
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@kernel.org # v4.18+
      Fixes: 9aae1780 ("perf/x86/intel/uncore: Clean up client IMC uncore")
      Link: https://lkml.kernel.org/r/20190227165729.1861-1-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8041ffd3
    • A
      perf/ring_buffer: Use high order allocations for AUX buffers optimistically · 5768402f
      Alexander Shishkin 提交于
      Currently, the AUX buffer allocator will use high-order allocations
      for PMUs that don't support hardware scatter-gather chaining to ensure
      large contiguous blocks of pages, and always use an array of single
      pages otherwise.
      
      There is, however, a tangible performance benefit in using larger chunks
      of contiguous memory even in the latter case, that comes from not having
      to fetch the next page's address at every page boundary. In particular,
      a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
      penalty with a single multi-page output region in snapshot mode (no PMI)
      than with multiple single-page output regions, from ~6% down to ~4%. For
      the snapshot mode it does make a difference as it is intended to run over
      long periods of time.
      
      For this reason, change the allocation policy to always optimistically
      start with the highest possible order when allocating pages for the AUX
      buffer, desceding until the allocation succeeds or order zero allocation
      fails.
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190215114727.62648-2-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5768402f
  4. 07 3月, 2019 19 次提交
  5. 06 3月, 2019 16 次提交
    • L
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 203b6609
      Linus Torvalds 提交于
      Pull perf updates from Ingo Molnar:
       "Lots of tooling updates - too many to list, here's a few highlights:
      
         - Various subcommand updates to 'perf trace', 'perf report', 'perf
           record', 'perf annotate', 'perf script', 'perf test', etc.
      
         - CPU and NUMA topology and affinity handling improvements,
      
         - HW tracing and HW support updates:
            - Intel PT updates
            - ARM CoreSight updates
            - vendor HW event updates
      
         - BPF updates
      
         - Tons of infrastructure updates, both on the build system and the
           library support side
      
         - Documentation updates.
      
         - ... and lots of other changes, see the changelog for details.
      
        Kernel side updates:
      
         - Tighten up kprobes blacklist handling, reduce the number of places
           where developers can install a kprobe and hang/crash the system.
      
         - Fix/enhance vma address filter handling.
      
         - Various PMU driver updates, small fixes and additions.
      
         - refcount_t conversions
      
         - BPF updates
      
         - error code propagation enhancements
      
         - misc other changes"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
        perf script python: Add Python3 support to syscall-counts-by-pid.py
        perf script python: Add Python3 support to syscall-counts.py
        perf script python: Add Python3 support to stat-cpi.py
        perf script python: Add Python3 support to stackcollapse.py
        perf script python: Add Python3 support to sctop.py
        perf script python: Add Python3 support to powerpc-hcalls.py
        perf script python: Add Python3 support to net_dropmonitor.py
        perf script python: Add Python3 support to mem-phys-addr.py
        perf script python: Add Python3 support to failed-syscalls-by-pid.py
        perf script python: Add Python3 support to netdev-times.py
        perf tools: Add perf_exe() helper to find perf binary
        perf script: Handle missing fields with -F +..
        perf data: Add perf_data__open_dir_data function
        perf data: Add perf_data__(create_dir|close_dir) functions
        perf data: Fail check_backup in case of error
        perf data: Make check_backup work over directories
        perf tools: Add rm_rf_perf_data function
        perf tools: Add pattern name checking to rm_rf
        perf tools: Add depth checking to rm_rf
        perf data: Add global path holder
        ...
      203b6609
    • L
      Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3478588b
      Linus Torvalds 提交于
      Pull locking updates from Ingo Molnar:
       "The biggest part of this tree is the new auto-generated atomics API
        wrappers by Mark Rutland.
      
        The primary motivation was to allow instrumentation without uglifying
        the primary source code.
      
        The linecount increase comes from adding the auto-generated files to
        the Git space as well:
      
          include/asm-generic/atomic-instrumented.h     | 1689 ++++++++++++++++--
          include/asm-generic/atomic-long.h             | 1174 ++++++++++---
          include/linux/atomic-fallback.h               | 2295 +++++++++++++++++++++++++
          include/linux/atomic.h                        | 1241 +------------
      
        I preferred this approach, so that the full call stack of the (already
        complex) locking APIs is still fully visible in 'git grep'.
      
        But if this is excessive we could certainly hide them.
      
        There's a separate build-time mechanism to determine whether the
        headers are out of date (they should never be stale if we do our job
        right).
      
        Anyway, nothing from this should be visible to regular kernel
        developers.
      
        Other changes:
      
         - Add support for dynamic keys, which removes a source of false
           positives in the workqueue code, among other things (Bart Van
           Assche)
      
         - Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)
      
         - qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)
      
         - misc other updates and enhancements"
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
        locking/lockdep: Shrink struct lock_class_key
        locking/lockdep: Add module_param to enable consistency checks
        lockdep/lib/tests: Test dynamic key registration
        lockdep/lib/tests: Fix run_tests.sh
        kernel/workqueue: Use dynamic lockdep keys for workqueues
        locking/lockdep: Add support for dynamic keys
        locking/lockdep: Verify whether lock objects are small enough to be used as class keys
        locking/lockdep: Check data structure consistency
        locking/lockdep: Reuse lock chains that have been freed
        locking/lockdep: Fix a comment in add_chain_cache()
        locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
        locking/lockdep: Reuse list entries that are no longer in use
        locking/lockdep: Free lock classes that are no longer in use
        locking/lockdep: Update two outdated comments
        locking/lockdep: Make it easy to detect whether or not inside a selftest
        locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
        locking/lockdep: Initialize the locks_before and locks_after lists earlier
        locking/lockdep: Make zap_class() remove all matching lock order entries
        locking/lockdep: Reorder struct lock_class members
        locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
        ...
      3478588b
    • L
      Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c8f5ed6e
      Linus Torvalds 提交于
      Pull EFI updates from Ingo Molnar:
       "The main EFI changes in this cycle were:
      
         - Use 32-bit alignment for efi_guid_t
      
         - Allow the SetVirtualAddressMap() call to be omitted
      
         - Implement earlycon=efifb based on existing earlyprintk code
      
         - Various minor fixes and code cleanups from Sai, Ard and me"
      
      * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Fix build error due to enum collision between efi.h and ima.h
        efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation
        x86: Make ARCH_USE_MEMREMAP_PROT a generic Kconfig symbol
        efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted
        efi: Replace GPL license boilerplate with SPDX headers
        efi/fdt: Apply more cleanups
        efi: Use 32-bit alignment for efi_guid_t
        efi/memattr: Don't bail on zero VA if it equals the region's PA
        x86/efi: Mark can_free_region() as an __init function
      c8f5ed6e
    • Y
      perf clang: Remove needless extra semicolon · a53837a5
      Yang Wei 提交于
      Delete a superfluous semicolon in getBPFObjectFromModule().
      Signed-off-by: NYang Wei <yang.wei9@zte.com.cn>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yang Wei <albin_yang@163.com>
      Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a53837a5
    • A
      perf bpf: Automatically add BTF ELF markers · 3163613c
      Arnaldo Carvalho de Melo 提交于
      The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
      place with the keys and values types of maps so that one can store the
      struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
      = BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
      BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
      MAP_ID'.
      
      Since we already have this for defining maps in 'perf trace' BPF events:
      
         bpf_map(name, _type, type_key, type_val, _max_entries)
      
      As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:
      
       --- 8< ---
      
      struct syscall {
              bool    enabled;
      };
      
      bpf_map(syscalls, ARRAY, int, struct syscall, 512);
      
       --- 8< ---
      
      All we need is to get all that already available info, piggyback on the
      'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
      'perf trace' BPF programs and do that without requiring changes to the
      BPF programs already defining maps using 'bpf_map()'.
      
      So this is what we have before this patch:
      
      1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
         so that we can use the .o later as a pre-compiled BPF bytecode:
      
        # grep '\[llvm\]' -A2 ~/.perfconfig
        [llvm]
      	dump-obj = true
      	clang-opt = -g
      
        #
        # clang --version
        clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
        Target: x86_64-unknown-linux-gnu
        Thread model: posix
        InstalledDir: /opt/llvm/bin
      
      2) Note the -g there so that we get clang to generate debuginfo, and
         since the target is 'bpf' it will generate the BTF info in this
         clang version (9.0).
      
      3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
         source code:
      
        # perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
        LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.025 MB perf.data ]
      
        # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
      
      4) Look at the BTF structs encoded in it:
      
        # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        syscall_enter_args	64	0
        augmented_filename	264	0
        syscall	1	0
        syscall_exit_args	24	0
        bpf_map	28	0
        #
        # pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        # pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        struct syscall {
      	  bool                       enabled;              /*     0     1 */
      
      	  /* size: 1, cachelines: 1, members: 1 */
      	  /* last cacheline: 1 bytes */
        };
        #
      
      5) Ok, with just this we don't have the markers expected by the libbpf
         loader and when we run with this BPF bytecode, because we have:
      
        # grep '\[trace\]' -A1 ~/.perfconfig
        [trace]
      	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        #
      
      6) Lets do a 'perf trace' system wide session using this BPF program:
      
         # perf trace -e *mmsg,open*
        Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
        DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
        Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
        lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12
      
      7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
        loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):
      
        # bpftool map list | tail -6
        149: perf_event_array  name __augmented_sys  flags 0x0
      	  key 4B  value 4B  max_entries 8  memlock 4096B
        150: array  name syscalls  flags 0x0
      	  key 4B  value 1B  max_entries 512  memlock 8192B
        151: hash  name pids_filtered  flags 0x0
      	  key 4B  value 1B  max_entries 64  memlock 8192B
        #
      
      8) Dump the "pids_filtered", map, that will have one entry per PID that
         'perf trace' wants filtered, which includes its own, to avoid a
         tracing feedback loop (perf trace shows the syscalls it does which
         generates more syscalls that it has to show that...), it also
         auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
         same reason:
      
        # bpftool map dump id 151
        key: a5 0c 00 00  value: 01
        key: 14 63 00 00  value: 01
        Found 2 elements
        #
      
      9) Since there is no BTF info available, it does a generic hex dump :-\
      
      10) Now, with this patch applied, we'll do steps 3 to 6 again and look
          with pahole if there are extra structs encoded in BTF:
      
        # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        syscall_enter_args	64	0
        augmented_filename	264	0
        syscall	1	0
        syscall_exit_args	24	0
        bpf_map	28	0
        ____btf_map___augmented_syscalls__	8	0
        ____btf_map_syscalls	8	0
        ____btf_map_pids_filtered	8	0
        #
      
      11) Yes, those __btf_map_ + the map names, lets see how they look like:
      
        # pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        struct ____btf_map_syscalls {
      	  int                        key;                  /*     0     4 */
      	  struct syscall             value;                /*     4     1 */
      
      	  /* size: 8, cachelines: 1, members: 2 */
      	  /* padding: 3 */
      	  /* last cacheline: 8 bytes */
        };
        #
      
      12) Lets repeat step 7 to get the new map ids:
      
        # bpftool map list | tail -6
        155: perf_event_array  name __augmented_sys  flags 0x0
      	  key 4B  value 4B  max_entries 8  memlock 4096B
        156: array  name syscalls  flags 0x0
      	  key 4B  value 1B  max_entries 512  memlock 8192B
        157: hash  name pids_filtered  flags 0x0
      	  key 4B  value 1B  max_entries 64  memlock 8192B
        #
      
      13) And finally lets dump the 'pids_filtered':
      
        # bpftool map dump id 157
        [{
              "key": 3237,
              "value": true
          },{
              "key": 26435,
              "value": true
          }
        ]
        #
      
      Looks much better! BTF info was used to interpret the key as an integer
      and the value as a struct with just one boolean member, so to make it
      more compact, show just the 'true' value where we saw '01'.
      
      Now to make 'perf trace --dump-map' to use BTF!
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3163613c
    • L
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3717f613
      Linus Torvalds 提交于
      Pull RCU updates from Ingo Molnar:
       "The main RCU related changes in this cycle were:
      
         - Additional cleanups after RCU flavor consolidation
      
         - Grace-period forward-progress cleanups and improvements
      
         - Documentation updates
      
         - Miscellaneous fixes
      
         - spin_is_locked() conversions to lockdep
      
         - SPDX changes to RCU source and header files
      
         - SRCU updates
      
         - Torture-test updates, including nolibc updates and moving nolibc to
           tools/include"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
        locking/locktorture: Convert to SPDX license identifier
        linux/torture: Convert to SPDX license identifier
        torture: Convert to SPDX license identifier
        linux/srcu: Convert to SPDX license identifier
        linux/rcutree: Convert to SPDX license identifier
        linux/rcutiny: Convert to SPDX license identifier
        linux/rcu_sync: Convert to SPDX license identifier
        linux/rcu_segcblist: Convert to SPDX license identifier
        linux/rcupdate: Convert to SPDX license identifier
        linux/rcu_node_tree: Convert to SPDX license identifier
        rcu/update: Convert to SPDX license identifier
        rcu/tree: Convert to SPDX license identifier
        rcu/tiny: Convert to SPDX license identifier
        rcu/sync: Convert to SPDX license identifier
        rcu/srcu: Convert to SPDX license identifier
        rcu/rcutorture: Convert to SPDX license identifier
        rcu/rcu_segcblist: Convert to SPDX license identifier
        rcu/rcuperf: Convert to SPDX license identifier
        rcu/rcu.h: Convert to SPDX license identifier
        RCU/torture.txt: Remove section MODULE PARAMETERS
        ...
      3717f613
    • L
      Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b1b988a6
      Linus Torvalds 提交于
      Pull year 2038 updates from Thomas Gleixner:
       "Another round of changes to make the kernel ready for 2038. After lots
        of preparatory work this is the first set of syscalls which are 2038
        safe:
      
          403 clock_gettime64
          404 clock_settime64
          405 clock_adjtime64
          406 clock_getres_time64
          407 clock_nanosleep_time64
          408 timer_gettime64
          409 timer_settime64
          410 timerfd_gettime64
          411 timerfd_settime64
          412 utimensat_time64
          413 pselect6_time64
          414 ppoll_time64
          416 io_pgetevents_time64
          417 recvmmsg_time64
          418 mq_timedsend_time64
          419 mq_timedreceiv_time64
          420 semtimedop_time64
          421 rt_sigtimedwait_time64
          422 futex_time64
          423 sched_rr_get_interval_time64
      
        The syscall numbers are identical all over the architectures"
      
      * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        riscv: Use latest system call ABI
        checksyscalls: fix up mq_timedreceive and stat exceptions
        unicore32: Fix __ARCH_WANT_STAT64 definition
        asm-generic: Make time32 syscall numbers optional
        asm-generic: Drop getrlimit and setrlimit syscalls from default list
        32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
        compat ABI: use non-compat openat and open_by_handle_at variants
        y2038: add 64-bit time_t syscalls to all 32-bit architectures
        y2038: rename old time and utime syscalls
        y2038: remove struct definition redirects
        y2038: use time32 syscall names on 32-bit
        syscalls: remove obsolete __IGNORE_ macros
        y2038: syscalls: rename y2038 compat syscalls
        x86/x32: use time64 versions of sigtimedwait and recvmmsg
        timex: change syscalls to use struct __kernel_timex
        timex: use __kernel_timex internally
        sparc64: add custom adjtimex/clock_adjtime functions
        time: fix sys_timer_settime prototype
        time: Add struct __kernel_timex
        time: make adjtime compat handling available for 32 bit
        ...
      b1b988a6
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · edaed168
      Linus Torvalds 提交于
      Pull x86/pti update from Thomas Gleixner:
       "Just a single change from the anti-performance departement:
      
         - Add a new PR_SPEC_DISABLE_NOEXEC option which allows to apply the
           speculation protections on a process without inheriting the state
           on exec.
      
           This remedies a situation where a Java-launcher has speculation
           protections enabled because that's the default for JVMs which
           causes the launched regular harmless processes to inherit the
           protection state which results in unintended performance
           degradation"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
      edaed168
    • L
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 78f86013
      Linus Torvalds 提交于
      Pull irq updates from Thomas Gleixner:
       "The interrupt departement delivers this time:
      
         - New infrastructure to manage NMIs on platforms which have a sane
           NMI delivery, i.e. identifiable NMI vectors instead of a single
           lump.
      
         - Simplification of the interrupt affinity management so drivers
           don't have to implement ugly loops around the PCI/MSI enablement.
      
         - Speedup for interrupt statistics in /proc/stat
      
         - Provide a function to retrieve the default irq domain
      
         - A new interrupt controller for the Loongson LS1X platform
      
         - Affinity support for the SiFive PLIC
      
         - Better support for the iMX irqsteer driver
      
         - NUMA aware memory allocations for GICv3
      
         - The usual small fixes, improvements and cleanups all over the
           place"
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        irqchip/imx-irqsteer: Add multi output interrupts support
        irqchip/imx-irqsteer: Change to use reg_num instead of irq_group
        dt-bindings: irq: imx-irqsteer: Add multi output interrupts support
        dt-binding: irq: imx-irqsteer: Use irq number instead of group number
        irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code
        irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables
        irqdomain: Allow the default irq domain to be retrieved
        irqchip/sifive-plic: Implement irq_set_affinity() for SMP host
        irqchip/sifive-plic: Differentiate between PLIC handler and context
        irqchip/sifive-plic: Add warning in plic_init() if handler already present
        irqchip/sifive-plic: Pre-compute context hart base and enable base
        PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets
        genirq/affinity: Remove the leftovers of the original set support
        nvme-pci: Simplify interrupt allocation
        genirq/affinity: Add new callback for (re)calculating interrupt sets
        genirq/affinity: Store interrupt sets size in struct irq_affinity
        genirq/affinity: Code consolidation
        irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.
        irqchip/i8259: Fix shutdown order by moving syscore_ops registration
        dt-bindings: interrupt-controller: loongson ls1x intc
        ...
      78f86013
    • L
      Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 18483190
      Linus Torvalds 提交于
      Pull timer and clockevent updates from Thomas Gleixner:
       "The time(r) core and clockevent updates are mostly boring this time:
      
         - A new driver for the Tegra210 timer
      
         - Small fixes and improvements alll over the place
      
         - Documentation updates and cleanups"
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
        soc/tegra: default select TEGRA_TIMER for Tegra210
        clocksource/drivers/tegra: Add Tegra210 timer support
        dt-bindings: timer: add Tegra210 timer
        clocksource/drivers/timer-cs5535: Rename the file for consistency
        clocksource/drivers/timer-pxa: Rename the file for consistency
        clocksource/drivers/tango-xtal: Rename the file for consistency
        dt-bindings: timer: gpt: update binding doc
        clocksource/drivers/exynos_mct: Remove unused header includes
        dt-bindings: timer: mediatek: update bindings for MT7629 SoC
        clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
        clocksource/drivers/exynos_mct: Remove dead code
        clocksource/drivers/riscv: Add required checks during clock source init
        dt-bindings: timer: renesas: tmu: Document r8a774c0 bindings
        dt-bindings: timer: renesas, cmt: Document r8a774c0 CMT support
        clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
        clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
        clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability
        clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
        timers: Mark expected switch fall-throughs
        timekeeping/debug: No need to check return value of debugfs_create functions
        ...
      18483190
    • L
      Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · d9862cfb
      Linus Torvalds 提交于
      Pull MIPS updates from Paul Burton:
      
       - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
         (GINVT) instructions, allowing for more efficient TLB maintenance
         when running on a CPU such as the I6500 that supports these.
      
       - Enable huge page support for MIPS64r6.
      
       - Optimize post-DMA cache sync by removing that code entirely for
         kernel configurations in which we know it won't be needed.
      
       - The number of pages allocated for interrupt stacks is now calculated
         correctly, where before we would wastefully allocate too much memory
         in some configurations.
      
       - The ath79 platform migrates to devicetree.
      
       - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
      
       - The ingenic/jz4740 platform gains support for appended devicetrees.
      
       - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
         cleanups as do various pieces of core architecture code.
      
      * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
        MIPS: lantiq: Remove separate GPHY Firmware loader
        MIPS: ingenic: Add support for appended devicetree
        MIPS: SGI-IP27: rework HUB interrupts
        MIPS: SGI-IP27: do boot CPU init later
        MIPS: SGI-IP27: do xtalk scanning later
        MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
        MIPS: SGI-IP27: clean up bridge access and header files
        MIPS: SGI-IP27: get rid of volatile and hubreg_t
        MIPS: irq: Allocate accurate order pages for irq stack
        MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
        MIPS: eBPF: Remove REG_32BIT_ZERO_EX
        MIPS: eBPF: Always return sign extended 32b values
        MIPS: CM: Fix indentation
        MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
        MIPS: OCTEON: program rx/tx-delay always from DT
        MIPS: OCTEON: delete board-specific link status
        MIPS: OCTEON: don't lie about interface type of CN3005 board
        MIPS: OCTEON: warn if deprecated link status is being used
        MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
        MIPS: Delete unused flush_cache_sigtramp()
        ...
      d9862cfb
    • L
      Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 8feed3ef
      Linus Torvalds 提交于
      Pull parisc updates from Helge Deller:
       "The most important changes in this patch set are:
      
         - DMA-related cleanups for parisc with the aim to move anything not
           required by drivers out of <asm/dma-mapping.h>, by Christoph
           Hellwig
      
         - Switch to memblock_alloc(), by Mike Rapoport
      
         - Makefile cleanups by Masahiro Yamada
      
         - Switch to bust_spinlocks(), by Sergey Senozhatsky
      
         - Improved initial SMP affinity selection for IRQs
      
         - Added IPI- and rescheduling interrupts in /proc/interrupts output"
      
      * 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
        parisc: use memblock_alloc() instead of custom get_memblock()
        parisc: Add constants for various PDC firmware calls
        parisc: Add constant for PDC_PAT_COMPLEX firmware call
        parisc: Show machine product number during boot
        parisc: Add constants for PDC_RELOCATE PDC call
        parisc: Add PDC_CRASH_PREP PDC function number
        parisc: Use F_EXTEND() macro in iosapic code
        parisc: remove the HBA_DATA macro
        parisc/lba_pci: use container_of in LBA_DEV
        parisc/dino: use container_of in DINO_DEV
        parisc: properly type the return value of parisc_walk_tree
        parisc: properly type the iommu field in struct pci_hba_data
        parisc: turn GET_IOC into an inline function
        parisc: move internal implementation details out of <asm/dma-mapping.h>
        parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
        parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
        parisc: replace oops_in_progress manipulation with bust_spinlocks()
        parisc: Improve initial IRQ to CPU assignment
        parisc: Count IPI function call interrupts
        parisc: Show rescheduling interrupts on SMP machines only
        ...
      8feed3ef
    • L
      Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3591b195
      Linus Torvalds 提交于
      Pull s390 updates from Martin Schwidefsky:
      
       - A copy of Arnds compat wrapper generation series
      
       - Pass information about the KVM guest to the host in form the control
         program code and the control program version code
      
       - Map IOV resources to support PCI physical functions on s390
      
       - Add vector load and store alignment hints to improve performance
      
       - Use the "jdd" constraint with gcc 9 to make jump labels working again
      
       - Remove amode workaround for old z/VM releases from the DCSS code
      
       - Add support for in-kernel performance measurements using the CPU
         measurement counter facility
      
       - Introduce a new PMU device cpum_cf_diag to capture counters and store
         thenn as event raw data.
      
       - Bug fixes and cleanups
      
      * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
        Revert "s390/cpum_cf: Add kernel message exaplanations"
        s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
        s390/suspend: fix prefix register reset in swsusp_arch_resume
        s390: warn about clearing als implied facilities
        s390: allow overriding facilities via command line
        s390: clean up redundant facilities list setup
        s390/als: remove duplicated in-place implementation of stfle
        s390/cio: Use cpa range elsewhere within vfio-ccw
        s390/cio: Fix vfio-ccw handling of recursive TICs
        s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
        s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
        s390/cpum_cf: Add kernel message exaplanations
        s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
        s390/cpum_cf: add ctr_stcctm() function
        s390/cpum_cf: move common functions into a separate file
        s390/cpum_cf: introduce kernel_cpumcf_avail() function
        s390/cpu_mf: replace stcctm5() with the stcctm() function
        s390/cpu_mf: add store cpu counter multiple instruction support
        s390/cpum_cf: Add minimal in-kernel interface for counter measurements
        s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
        ...
      3591b195
    • L
      Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 45f5532a
      Linus Torvalds 提交于
      Pull m68k updates from Geert Uytterhoeven:
      
       - VLA removal
      
       - gcc-8.x build fixes
      
       - small improvements and cleanups
      
       - defconfig updates
      
      * tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Add -ffreestanding to CFLAGS
        m68k/apollo: Fix comment in Makefile
        dio: Fix buffer overflow in case of unknown board
        m68k/defconfig: Update defconfigs for v5.0-rc1
        m68k/atari: Avoid VLA use in atari_switches_setup()
        m68k: Avoid VLA use in mangle_kernel_stack()
        m68k/mac: Use '030 reset method on SE/30
        m68k/mac: Remove obsolete comment
        m68k/mac: Skip VIA port setup unless RTC is connected
        m68k/mac: Clean up unused timer definitions
        m68k/defconfig: Drop NET_VENDOR_<FOO>=n
      45f5532a
    • B
      x86: Deprecate a.out support · eac61655
      Borislav Petkov 提交于
      Linux supports ELF binaries for ~25 years now.  a.out coredumping has
      bitrotten quite significantly and would need some fixing to get it into
      shape again but considering how even the toolchains cannot create a.out
      executables in its default configuration, let's deprecate a.out support
      and remove it a couple of releases later, instead.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-fsdevel@vger.kernel.org>
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <x86@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      eac61655
    • L
      a.out: remove core dumping support · 08300f44
      Linus Torvalds 提交于
      We're (finally) phasing out a.out support for good.  As Borislav Petkov
      points out, we've supported ELF binaries for about 25 years by now, and
      coredumping in particular has bitrotted over the years.
      
      None of the tool chains even support generating a.out binaries any more,
      and the plan is to deprecate a.out support entirely for the kernel.  But
      I want to start with just removing the core dumping code, because I can
      still imagine that somebody actually might want to support a.out as a
      simpler biinary format.
      
      Particularly if you generate some random binaries on the fly, ELF is a
      much more complicated format (admittedly ELF also does have a lot of
      toolchain support, mitigating that complexity a lot and you really
      should have moved over in the last 25 years).
      
      So it's at least somewhat possible that somebody out there has some
      workflow that still involves generating and running a.out executables.
      
      In contrast, it's very unlikely that anybody depends on debugging any
      legacy a.out core files.  But regardless, I want this phase-out to be
      done in two steps, so that we can resurrect a.out support (if needed)
      without having to resurrect the core file dumping that is almost
      certainly not needed.
      
      Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
      cut at this had missed.
      
      And Alan Cox points out that the a.out binary loader _could_ be done in
      user space if somebody wants to, but we might keep just the loader in
      the kernel if somebody really wants it, since the loader isn't that big
      and has no really odd special cases like the core dumping does.
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08300f44