1. 11 May, 2018 1 commit
  2. 10 May, 2018 1 commit
  3. 09 May, 2018 1 commit
  4. 04 May, 2018 1 commit
  5. 30 Apr, 2018 2 commits
  6. 29 Apr, 2018 2 commits
  7. 27 Apr, 2018 2 commits
  8. 25 Apr, 2018 2 commits
  9. 23 Apr, 2018 1 commit
  10. 20 Apr, 2018 1 commit
  11. 19 Apr, 2018 2 commits
  12. 17 Apr, 2018 2 commits
    • perf/core: Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE] · 101592b4
      Alexey Budankov committed
      Store preempting context switch out event into Perf trace as a part of
      PERF_RECORD_SWITCH[_CPU_WIDE] record.
      
      The percentage of preempting and non-preempting context switches helps
      in understanding the nature of the workloads (CPU or IO bound) that are
      running on a machine.
      
      The event is treated as a preemption when the task->state value of the
      thread being switched out is TASK_RUNNING. The event type is encoded
      using the PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit.
      
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/9ff84e83-a0ca-dd82-a6d0-cb951689be74@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
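
      A minimal sketch of how a trace consumer might classify a
      PERF_RECORD_SWITCH[_CPU_WIDE] record from the `misc` field of its
      header. The two constants are assumed to mirror the values in
      include/uapi/linux/perf_event.h (bits 13 and 14) and are redefined
      locally so the snippet stands alone; `classify_switch()` and
      `enum switch_kind` are hypothetical names, not part of perf.

      ```c
      #include <stdint.h>

      /* Assumed to mirror include/uapi/linux/perf_event.h; check your headers. */
      #define PERF_RECORD_MISC_SWITCH_OUT         (1 << 13)
      #define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT (1 << 14)

      /* Hypothetical classification of one context-switch record. */
      enum switch_kind { SWITCH_IN, SWITCH_OUT, SWITCH_OUT_PREEMPT };

      /* misc is the 16-bit perf_event_header.misc value of the record. */
      static enum switch_kind classify_switch(uint16_t misc)
      {
      	if (!(misc & PERF_RECORD_MISC_SWITCH_OUT))
      		return SWITCH_IN;
      	/* The preempt bit is only interpreted together with SWITCH_OUT. */
      	if (misc & PERF_RECORD_MISC_SWITCH_OUT_PREEMPT)
      		return SWITCH_OUT_PREEMPT;
      	return SWITCH_OUT;
      }
      ```

      Counting the SWITCH_OUT_PREEMPT results against the plain SWITCH_OUT
      results gives the preempting versus non-preempting ratio that the
      commit message describes.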
    • tools/headers: Synchronize kernel ABI headers, v4.17-rc1 · e2f73a18
      Ingo Molnar committed
      Sync the following tooling headers with the latest kernel version:
      
        tools/arch/arm/include/uapi/asm/kvm.h
          - New ABI: KVM_REG_ARM_*
      
        tools/arch/x86/include/asm/required-features.h
          - Removal of NEED_LA57 dependency
      
        tools/arch/x86/include/uapi/asm/kvm.h
          - New KVM ABI: KVM_SYNC_X86_*
      
        tools/include/uapi/asm-generic/mman-common.h
          - New ABI: MAP_FIXED_NOREPLACE flag
      
        tools/include/uapi/linux/bpf.h
          - New ABI: BPF_F_SEQ_NUMBER functions
      
        tools/include/uapi/linux/if_link.h
          - New ABI: IFLA tun and rmnet support
      
        tools/include/uapi/linux/kvm.h
          - New ABI: hyperv eventfd and CONN_ID_MASK support plus header cleanups
      
        tools/include/uapi/sound/asound.h
          - New ABI: SNDRV_PCM_FORMAT_FIRST PCM format specifier
      
        tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
          - The x86 system call table description changed due to the ptregs changes and the renames, in:
      
      	d5a00528: syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
      	5ac9efa3: syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
      	ebeb8c82: syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
      
      Also fix the x86 syscall table warning:
      
        -Warning: Kernel ABI header at 'tools/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        +Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
      
      None of these changes impact existing tooling code, so we only have to copy the kernel version.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Brian Robbins <brianrob@microsoft.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Takuya Yamamoto <tkydevel@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lkml.kernel.org/r/20180416064024.ofjtrz5yuu3ykhvl@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 12 Apr, 2018 2 commits
  14. 06 Apr, 2018 1 commit
    • tools headers uapi: Synchronize i915_drm.h · 01f97511
      Arnaldo Carvalho de Melo committed
      To pick up the changes in:
      
        c822e059 drm/i915: expose rcs topology through query uAPI
        a446ae2c drm/i915: add query uAPI
      
      This affects 'perf trace', that automagically gets the definition of the
      new I915_QUERY DRM ioctl:
      
        --- /tmp/build/perf/trace/beauty/generated/ioctl/drm_ioctl_array.c.old 2018-04-05 14:38:33.660111995 -0300
        +++ /tmp/build/perf/trace/beauty/generated/ioctl/drm_ioctl_array.c 2018-04-05 14:40:17.923283914 -0300
        @@ -158,4 +158,5 @@
                [DRM_COMMAND_BASE + 0x36] = "I915_PERF_OPEN",
                [DRM_COMMAND_BASE + 0x37] = "I915_PERF_ADD_CONFIG",
                [DRM_COMMAND_BASE + 0x38] = "I915_PERF_REMOVE_CONFIG",
        +       [DRM_COMMAND_BASE + 0x39] = "I915_QUERY",
         };
      
      I.e. on systems where this is used it will appear when, for instance,
      one does a system wide 'perf trace' session looking for ioctl calls,
      just like it does with the previously implemented DRM_I915 ioctls:
      
        # perf trace -e ioctl --filter-pids 2190
      <SNIP>
        4346.232 ( 0.012 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7fff3b0cd910) = 0
        4346.246 ( 0.002 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_MADVISE, arg: 0x7fff3b0cd980) = 0
        4346.252 ( 0.002 ms): gnome-shell/1455 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7fff3b0cdb00) = 0
      <SNIP>
      
      This silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-5kxuvruuzdbojvf90f8j2wat@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
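
      The generated table in the diff above is a sparse array of strings
      indexed by ioctl command number. A minimal sketch of how such a table
      can be consulted, assuming DRM_COMMAND_BASE is 0x40 as in the DRM UAPI
      headers; `drm_ioctl_name()` is a hypothetical helper and not perf's
      actual lookup code.

      ```c
      #include <stddef.h>

      #define DRM_COMMAND_BASE 0x40 /* assumed, as in include/uapi/drm/drm.h */

      /* Subset of the generated table shown in the diff above. */
      static const char *drm_ioctl_cmds[] = {
      	[DRM_COMMAND_BASE + 0x36] = "I915_PERF_OPEN",
      	[DRM_COMMAND_BASE + 0x37] = "I915_PERF_ADD_CONFIG",
      	[DRM_COMMAND_BASE + 0x38] = "I915_PERF_REMOVE_CONFIG",
      	[DRM_COMMAND_BASE + 0x39] = "I915_QUERY",
      };

      /* Hypothetical helper: name for command nr, or NULL if not in the table. */
      static const char *drm_ioctl_name(unsigned int nr)
      {
      	size_t n = sizeof(drm_ioctl_cmds) / sizeof(drm_ioctl_cmds[0]);

      	return nr < n ? drm_ioctl_cmds[nr] : NULL;
      }
      ```

      Slots that are not explicitly initialized are NULL, so an unknown
      command number can fall back to being printed as a raw value.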
  15. 03 Apr, 2018 1 commit
  16. 31 Mar, 2018 4 commits
    • selftests/bpf: Selftest for sys_bind post-hooks. · 1d436885
      Andrey Ignatov committed
      Add selftest for attach types `BPF_CGROUP_INET4_POST_BIND` and
      `BPF_CGROUP_INET6_POST_BIND`.
      
      The main things tested are:
      * prog load behaves as expected (valid/invalid accesses in prog);
      * prog attach behaves as expected (load- vs attach-time attach types);
      * `BPF_CGROUP_INET_SOCK_CREATE` can be attached in a backward compatible
        way;
      * post-hooks return expected result and errno.
      
      Example:
        # ./test_sock
        Test case: bind4 load with invalid access: src_ip6 .. [PASS]
        Test case: bind4 load with invalid access: mark .. [PASS]
        Test case: bind6 load with invalid access: src_ip4 .. [PASS]
        Test case: sock_create load with invalid access: src_port .. [PASS]
        Test case: sock_create load w/o expected_attach_type (compat mode) ..
        [PASS]
        Test case: sock_create load w/ expected_attach_type .. [PASS]
        Test case: attach type mismatch bind4 vs bind6 .. [PASS]
        Test case: attach type mismatch bind6 vs bind4 .. [PASS]
        Test case: attach type mismatch default vs bind4 .. [PASS]
        Test case: attach type mismatch bind6 vs sock_create .. [PASS]
        Test case: bind4 reject all .. [PASS]
        Test case: bind6 reject all .. [PASS]
        Test case: bind6 deny specific IP & port .. [PASS]
        Test case: bind4 allow specific IP & port .. [PASS]
        Test case: bind4 allow all .. [PASS]
        Test case: bind6 allow all .. [PASS]
        Summary: 16 PASSED, 0 FAILED
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: Selftest for sys_connect hooks · 622adafb
      Andrey Ignatov committed
      Add selftest for BPF_CGROUP_INET4_CONNECT and BPF_CGROUP_INET6_CONNECT
      attach types.
      
      Try to connect(2) to specified IP:port and test that:
      * remote IP:port pair is overridden;
      * local end of connection is bound to specified IP.
      
      All combinations of IPv4/IPv6 and TCP/UDP are tested.
      
      Example:
        # tcpdump -pn -i lo -w connect.pcap 2>/dev/null &
        [1] 478
        # strace -qqf -e connect -o connect.trace ./test_sock_addr.sh
        Wait for testing IPv4/IPv6 to become available ... OK
        Load bind4 with invalid type (can pollute stderr) ... REJECTED
        Load bind4 with valid type ... OK
        Attach bind4 with invalid type ... REJECTED
        Attach bind4 with valid type ... OK
        Load connect4 with invalid type (can pollute stderr) libbpf: load bpf \
          program failed: Permission denied
        libbpf: -- BEGIN DUMP LOG ---
        libbpf:
        0: (b7) r2 = 23569
        1: (63) *(u32 *)(r1 +24) = r2
        2: (b7) r2 = 16777343
        3: (63) *(u32 *)(r1 +4) = r2
        invalid bpf_context access off=4 size=4
        [ 1518.404609] random: crng init done
      
        libbpf: -- END LOG --
        libbpf: failed to load program 'cgroup/connect4'
        libbpf: failed to load object './connect4_prog.o'
        ... REJECTED
        Load connect4 with valid type ... OK
        Attach connect4 with invalid type ... REJECTED
        Attach connect4 with valid type ... OK
        Test case #1 (IPv4/TCP):
                Requested: bind(192.168.1.254, 4040) ..
                   Actual: bind(127.0.0.1, 4444)
                Requested: connect(192.168.1.254, 4040) from (*, *) ..
                   Actual: connect(127.0.0.1, 4444) from (127.0.0.4, 56068)
        Test case #2 (IPv4/UDP):
                Requested: bind(192.168.1.254, 4040) ..
                   Actual: bind(127.0.0.1, 4444)
                Requested: connect(192.168.1.254, 4040) from (*, *) ..
                   Actual: connect(127.0.0.1, 4444) from (127.0.0.4, 56447)
        Load bind6 with invalid type (can pollute stderr) ... REJECTED
        Load bind6 with valid type ... OK
        Attach bind6 with invalid type ... REJECTED
        Attach bind6 with valid type ... OK
        Load connect6 with invalid type (can pollute stderr) libbpf: load bpf \
          program failed: Permission denied
        libbpf: -- BEGIN DUMP LOG ---
        libbpf:
        0: (b7) r6 = 0
        1: (63) *(u32 *)(r1 +12) = r6
        invalid bpf_context access off=12 size=4
      
        libbpf: -- END LOG --
        libbpf: failed to load program 'cgroup/connect6'
        libbpf: failed to load object './connect6_prog.o'
        ... REJECTED
        Load connect6 with valid type ... OK
        Attach connect6 with invalid type ... REJECTED
        Attach connect6 with valid type ... OK
        Test case #3 (IPv6/TCP):
                Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
                   Actual: bind(::1, 6666)
                Requested: connect(face:b00c:1234:5678::abcd, 6060) from (*, *)
                   Actual: connect(::1, 6666) from (::6, 37458)
        Test case #4 (IPv6/UDP):
                Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
                   Actual: bind(::1, 6666)
                Requested: connect(face:b00c:1234:5678::abcd, 6060) from (*, *)
                   Actual: connect(::1, 6666) from (::6, 39315)
        ### SUCCESS
        # egrep 'connect\(.*AF_INET' connect.trace | \
        > egrep -vw 'htons\(1025\)' | fold -b -s -w 72
        502   connect(7, {sa_family=AF_INET, sin_port=htons(4040),
        sin_addr=inet_addr("192.168.1.254")}, 128) = 0
        502   connect(8, {sa_family=AF_INET, sin_port=htons(4040),
        sin_addr=inet_addr("192.168.1.254")}, 128) = 0
        502   connect(9, {sa_family=AF_INET6, sin6_port=htons(6060),
        inet_pton(AF_INET6, "face:b00c:1234:5678::abcd", &sin6_addr),
        sin6_flowinfo=0, sin6_scope_id=0}, 128) = 0
        502   connect(10, {sa_family=AF_INET6, sin6_port=htons(6060),
        inet_pton(AF_INET6, "face:b00c:1234:5678::abcd", &sin6_addr),
        sin6_flowinfo=0, sin6_scope_id=0}, 128) = 0
        # fg
        tcpdump -pn -i lo -w connect.pcap 2> /dev/null
        # tcpdump -r connect.pcap -n tcp | cut -c 1-72
        reading from file connect.pcap, link-type EN10MB (Ethernet)
        17:57:40.383533 IP 127.0.0.4.56068 > 127.0.0.1.4444: Flags [S], seq 1333
        17:57:40.383566 IP 127.0.0.1.4444 > 127.0.0.4.56068: Flags [S.], seq 112
        17:57:40.383589 IP 127.0.0.4.56068 > 127.0.0.1.4444: Flags [.], ack 1, w
        17:57:40.384578 IP 127.0.0.1.4444 > 127.0.0.4.56068: Flags [R.], seq 1,
        17:57:40.403327 IP6 ::6.37458 > ::1.6666: Flags [S], seq 406513443, win
        17:57:40.403357 IP6 ::1.6666 > ::6.37458: Flags [S.], seq 2448389240, ac
        17:57:40.403376 IP6 ::6.37458 > ::1.6666: Flags [.], ack 1, win 342, opt
        17:57:40.404263 IP6 ::1.6666 > ::6.37458: Flags [R.], seq 1, ack 1, win
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
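
      The two immediates in the connect4 verifier log above (23569 and
      16777343) can be read as the rewritten destination from the test
      output, port 4444 and address 127.0.0.1, held in network byte order
      and loaded as integers on a little-endian machine. That reading is an
      interpretation of the log rather than something the commit message
      states, and the helper names below are hypothetical.

      ```c
      #include <stdint.h>

      /* Byte-swap helpers; on a little-endian host they act like htons()/htonl(). */
      static uint16_t swap16(uint16_t v)
      {
      	return (uint16_t)((v << 8) | (v >> 8));
      }

      static uint32_t swap32(uint32_t v)
      {
      	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
      	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
      }

      /* 127.0.0.1 written as a host-order integer. */
      #define LOOPBACK_HOST_ORDER 0x7f000001u
      ```

      With these helpers, swap16(4444) is 23569 and
      swap32(LOOPBACK_HOST_ORDER) is 16777343, the values assigned to r2 in
      instructions 0 and 2 of the log.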
    • selftests/bpf: Selftest for sys_bind hooks · e50b0a6f
      Andrey Ignatov committed
      Add selftest to work with bpf_sock_addr context from
      `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` programs.
      
      Try to bind(2) on IP:port and apply:
      * loads to make sure context can be read correctly, including narrow
        loads (byte, half) for IP and full-size loads (word) for all fields;
      * stores to those fields allowed by verifier.
      
      All combinations of IPv4/IPv6 and TCP/UDP are tested.
      
      Both scenarios are tested:
      * valid programs can be loaded and attached;
      * invalid programs can be neither loaded nor attached.
      
      Test passes when expected data can be read from context in the
      BPF-program, and after the call to bind(2) socket is bound to IP:port
      pair that was written by BPF-program to the context.
      
      Example:
        # ./test_sock_addr
        Attached bind4 program.
        Test case #1 (IPv4/TCP):
                Requested: bind(192.168.1.254, 4040) ..
                   Actual: bind(127.0.0.1, 4444)
        Test case #2 (IPv4/UDP):
                Requested: bind(192.168.1.254, 4040) ..
                   Actual: bind(127.0.0.1, 4444)
        Attached bind6 program.
        Test case #3 (IPv6/TCP):
                Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
                   Actual: bind(::1, 6666)
        Test case #4 (IPv6/UDP):
                Requested: bind(face:b00c:1234:5678::abcd, 6060) ..
                   Actual: bind(::1, 6666)
        ### SUCCESS
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: Support expected_attach_type at prog load · d7be143b
      Andrey Ignatov committed
      Support setting `expected_attach_type` at prog load time in both
      `bpf/bpf.h` and `bpf/libbpf.h`.
      
      Since both headers already have an API to load programs, new functions
      are added so as not to break backward compatibility for the existing ones:
      * `bpf_load_program_xattr()` is added to `bpf/bpf.h`;
      * `bpf_prog_load_xattr()` is added to `bpf/libbpf.h`.
      
      Both new functions accept structures, `struct bpf_load_program_attr` and
      `struct bpf_prog_load_attr` correspondingly, where new fields can be
      added in the future w/o changing the API.
      
      Standard `_xattr` suffix is used to name the new API functions.
      
      Since `bpf_load_program_name()` is not used as heavily as
      `bpf_load_program()`, it was removed in favor of more generic
      `bpf_load_program_xattr()`.
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  17. 29 Mar, 2018 1 commit
  18. 20 Mar, 2018 4 commits
  19. 17 Mar, 2018 1 commit
    • KVM: X86: Provide a capability to disable MWAIT intercepts · 4d5422ce
      Wanpeng Li committed
      Allowing a guest to execute MWAIT without interception enables a guest
      to put a (physical) CPU into a power-saving state from which it may
      take longer to return than the host desires.
      
      Don't give a guest that power over a host by default. (Especially,
      since nothing prevents a guest from using MWAIT even when it is not
      advertised via CPUID.)
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Jan H. Schönherr <jschoenh@amazon.de>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  20. 15 Mar, 2018 1 commit
  21. 13 Mar, 2018 1 commit
    • perf/core: Implement fast breakpoint modification via _IOC_MODIFY_ATTRIBUTES · 32ff77e8
      Milind Chabbi committed
      Problem and motivation: Once a breakpoint perf event (PERF_TYPE_BREAKPOINT)
      is created, there is no flexibility to change the breakpoint type
      (bp_type), breakpoint address (bp_addr), or breakpoint length (bp_len). The
      only option is to close the perf event and configure a new breakpoint
      event. This inflexibility has a significant performance overhead. For
      example, sampling-based, lightweight performance profilers (and also
      concurrency bug detection tools) monitor different addresses for a short
      duration using PERF_TYPE_BREAKPOINT and then change the address (bp_addr),
      the kind of breakpoint (bp_type, e.g. from "write" to "read" or vice
      versa), or the length (bp_len) of the address being monitored. The cost of
      these modifications is prohibitive since it involves
      unmapping the circular buffer associated with the perf event, closing the
      perf event, opening another perf event and mmaping another circular buffer.
      
      Solution: The new ioctl flag for perf events,
      PERF_EVENT_IOC_MODIFY_ATTRIBUTES, introduced in this patch takes a pointer
      to a struct perf_event_attr as an argument to update an old breakpoint
      event with new address, type, and size. This facility allows retaining a
      previous mmaped perf events ring buffer and avoids having to close and
      reopen another perf event.
      
      This patch supports modifying only events of type PERF_TYPE_BREAKPOINT;
      future implementations can extend this feature. The patch replicates some
      of the functionality of modify_user_hw_breakpoint() in
      kernel/events/hw_breakpoint.c; modify_user_hw_breakpoint() cannot be called
      directly since perf_event_ctx_lock() is already held in _perf_ioctl().
      
      Evidence: Experiments show that the baseline (closing and reopening the
      event because an already created breakpoint cannot be modified) costs an
      order of magnitude (~10x) more than the suggested optimization
      (dynamically modifying a configured breakpoint via ioctl). When the
      breakpoints typically do not trap, the speedup is ~10x; even when the
      breakpoints always trap, the speedup is ~4x.
      
      Testing: tests posted at
      https://github.com/linux-contrib/perf_event_modify_bp demonstrate the
      performance significance of this patch. Tests also check the functional
      correctness of the patch.
      Signed-off-by: Milind Chabbi <chabbi.milind@gmail.com>
      [ Using modify_user_hw_breakpoint_check function. ]
      [ Reformatted PERF_EVENT_IOC_*, so the values are all in one column. ]
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oleg Nesterov <onestero@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20180312134548.31532-8-jolsa@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
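
      A minimal sketch of how user space might use the new ioctl, assuming
      UAPI headers recent enough to define PERF_EVENT_IOC_MODIFY_ATTRIBUTES.
      `retarget_breakpoint()` is a hypothetical helper; `fd` is assumed to
      come from an earlier perf_event_open() call for a PERF_TYPE_BREAKPOINT
      event, and `orig` is the attr used for that call. Copying `orig` and
      changing only the bp_* fields is a cautious choice, since the kernel
      may reject changes to other fields.

      ```c
      #include <linux/hw_breakpoint.h>
      #include <linux/perf_event.h>
      #include <stdint.h>
      #include <sys/ioctl.h>

      /*
       * Hypothetical helper: point an already-open breakpoint event at a new
       * address, type and length. Returns 0 on success, -1 with errno set.
       */
      static int retarget_breakpoint(int fd, const struct perf_event_attr *orig,
      			       uint64_t addr, uint32_t type, uint64_t len)
      {
      	struct perf_event_attr attr = *orig; /* keep all other fields as-is */

      	attr.bp_addr = addr;
      	attr.bp_type = type; /* e.g. HW_BREAKPOINT_W */
      	attr.bp_len = len;   /* e.g. HW_BREAKPOINT_LEN_4 */

      	/* No close/reopen and no munmap/mmap of the ring buffer is needed. */
      	return ioctl(fd, PERF_EVENT_IOC_MODIFY_ATTRIBUTES, &attr);
      }
      ```

      A profiler that rotates through watched addresses would call this in
      place of the close, perf_event_open and mmap sequence described in the
      problem statement.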
  22. 10 Mar, 2018 1 commit
  23. 08 Mar, 2018 1 commit
    • perf tools: Add MEM_TOPOLOGY feature to perf data file · e2091ced
      Jiri Olsa committed
      Add the MEM_TOPOLOGY feature to the perf data file. It carries the
      physical memory map and its node assignments.
      
      The format of data in MEM_TOPOLOGY is as follows:
      
        0 - version          | for future changes
        8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
       16 - count            | number of nodes
      
      For each node we store a map of its physical memory indexes:
      
       32 - node id          | node index
       40 - size             | size of bitmap
       48 - bitmap           | bitmap of memory indexes that belong to the node
                             | /sys/devices/system/node/node<NODE>/memory<INDEX>
      
      The MEM_TOPOLOGY data can be displayed with the following
      report command:
      
        $ perf report --header-only -I
        ...
        # memory nodes (nr 1, block size 0x8000000):
        #    0 [7G]: 0-23,32-69
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180307155020.32613-8-jolsa@kernel.org
      [ Rename 'index' to 'idx', as this breaks the build in rhel5, 6 and other systems where this is used by glibc headers ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
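
      As a check on the sample report output, the node size can be recovered
      from the block list: blocks 0-23 and 32-69 are 62 blocks of 0x8000000
      bytes (128 MiB) each, i.e. 7.75 GiB, which is consistent with the
      `[7G]` label if the report truncates to whole units (an assumption
      about its formatting). A minimal sketch, where `count_blocks()` and
      `node_bytes()` are hypothetical helpers and the bitmap is assumed to
      be stored in 64-bit words:

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /* Count the set bits in a bitmap of nbits bits stored in 64-bit words. */
      static uint64_t count_blocks(const uint64_t *bitmap, size_t nbits)
      {
      	uint64_t n = 0;

      	for (size_t i = 0; i < nbits; i++)
      		if (bitmap[i / 64] & (UINT64_C(1) << (i % 64)))
      			n++;
      	return n;
      }

      /* Hypothetical helper: memory covered by one node, in bytes. */
      static uint64_t node_bytes(const uint64_t *bitmap, size_t nbits,
      			   uint64_t block_size_bytes)
      {
      	return count_blocks(bitmap, nbits) * block_size_bytes;
      }
      ```

      For the node shown above, the bitmap words would be
      0xffffffff00ffffff (bits 0-23 and 32-63) and 0x3f (bits 64-69).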
  24. 05 Mar, 2018 1 commit
    • tools headers: Sync copy of kvm UAPI headers · d976a6e9
      Arnaldo Carvalho de Melo committed
      In 801e459a ("KVM: x86: Add a framework for supporting MSR-based
      features") a new ioctl was introduced. This sync of the kvm UAPI
      headers makes 'perf trace' aware of it:
      
        $ cd /tmp/build/perf/trace/beauty/generated/ioctl/
        $ diff -u kvm_ioctl_array.c.old kvm_ioctl_array.c
        --- /tmp/kvm_ioctl_array.c	2018-03-05 11:55:38.409145056 -0300
        +++ /tmp/build/perf/trace/beauty/generated/ioctl/kvm_ioctl_array.c	2018-03-05 11:56:17.456153501 -0300
        @@ -6,6 +6,7 @@
       	[0x04] = "GET_VCPU_MMAP_SIZE",
       	[0x05] = "GET_SUPPORTED_CPUID",
       	[0x09] = "GET_EMULATED_CPUID",
        +	[0x0a] = "GET_MSR_FEATURE_INDEX_LIST",
       	[0x40] = "SET_MEMORY_REGION",
       	[0x41] = "CREATE_VCPU",
       	[0x42] = "GET_DIRTY_LOG",
      
      So when using 'perf trace -e ioctl' that will appear along with the
      others, like in this excerpt of a system wide session:
      
        14.556 ( 0.006 ms): CPU 0/KVM/16077 ioctl(fd: 19<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        14.565 ( 0.006 ms): CPU 0/KVM/16077 ioctl(fd: 19<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        14.573 (         ): CPU 0/KVM/16077 ioctl(fd: 19<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) ...
        34.075 ( 0.016 ms): gnome-shell/2192 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffe4e73e850) = 0
        40.549 ( 0.012 ms): gnome-shell/2192 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffe4e73ece0) = 0
        40.625 ( 0.005 ms): gnome-shell/2192 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffe4e73e940) = 0
        40.632 ( 0.003 ms): gnome-shell/2192 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_MADVISE, arg: 0x7ffe4e73e9b0) = 0
      
      This also silences the perf build header copy drift verifier:
      
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j4' parallel build
        Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-h31oz5g0mt1dh2s2ajq6o6no@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  25. 15 Feb, 2018 1 commit
    • tools/headers: Synchronize kernel ABI headers, v4.16-rc1 · f091f1d6
      Ingo Molnar committed
      Sync the following tooling headers with the latest kernel version:
      
        tools/arch/powerpc/include/uapi/asm/kvm.h
        tools/arch/x86/include/asm/cpufeatures.h
        tools/include/uapi/drm/i915_drm.h
        tools/include/uapi/linux/if_link.h
        tools/include/uapi/linux/kvm.h
      
      All the changes are new ABI additions which don't impact their use
      in existing tooling.
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  26. 09 Feb, 2018 1 commit
  27. 07 Feb, 2018 1 commit
    • lib: optimize cpumask_next_and() · 0ade34c3
      Clement Courbet committed
      We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
      It's essentially a joined iteration in search for a non-zero bit, which is
      currently implemented as a lookup join (find a nonzero bit on the lhs,
      lookup the rhs to see if it's set there).
      
      Implement a direct join (find a nonzero bit on the incrementally built
      join).  Also add generic bitmap benchmarks in the new `test_find_bit`
      module for the new function (see `find_next_and_bit` in [2] and [3] below).
      
      For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
      faster with a geometric mean of 2.1 on 32 CPUs [1].  No impact on memory
      usage.  Note that on Arm, the new pure-C implementation still outperforms
      the old one that uses a mix of C and asm (`find_next_bit`) [3].
      
      [1] Approximate benchmark code:
      
      ```
        unsigned long src1p[nr_cpumask_longs] = {pattern1};
        unsigned long src2p[nr_cpumask_longs] = {pattern2};
        for (/*a bunch of repetitions*/) {
          for (int n = -1; n <= nr_cpu_ids; ++n) {
            asm volatile("" : "+rm"(src1p)); // prevent any optimization
            asm volatile("" : "+rm"(src2p));
            unsigned long result = cpumask_next_and(n, src1p, src2p);
            asm volatile("" : "+rm"(result));
          }
        }
      ```
      
      Results:
      pattern1    pattern2     time_before/time_after
      0x0000ffff  0x0000ffff   1.65
      0x0000ffff  0x00005555   2.24
      0x0000ffff  0x00001111   2.94
      0x0000ffff  0x00000000   14.0
      0x00005555  0x0000ffff   1.67
      0x00005555  0x00005555   1.71
      0x00005555  0x00001111   1.90
      0x00005555  0x00000000   6.58
      0x00001111  0x0000ffff   1.46
      0x00001111  0x00005555   1.49
      0x00001111  0x00001111   1.45
      0x00001111  0x00000000   3.10
      0x00000000  0x0000ffff   1.18
      0x00000000  0x00005555   1.18
      0x00000000  0x00001111   1.17
      0x00000000  0x00000000   1.25
      -----------------------------
                     geo.mean  2.06
      
      [2] test_find_next_bit, X86 (skylake)
      
       [ 3913.477422] Start testing find_bit() with random-filled bitmap
       [ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations
       [ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations
       [ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations
       [ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations
       [ 3913.480216] Start testing find_next_and_bit() with random-filled
       bitmap
       [ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations
       [ 3913.481075] Start testing find_bit() with sparse bitmap
       [ 3913.481078] find_next_bit: 2536 cycles, 66 iterations
       [ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations
       [ 3913.481255] find_last_bit: 2006 cycles, 66 iterations
       [ 3913.481265] find_first_bit: 17488 cycles, 66 iterations
       [ 3913.481266] Start testing find_next_and_bit() with sparse bitmap
       [ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations
      
      [3] test_find_next_bit, arm (v7 odroid XU3).
      
      [  267.206928] Start testing find_bit() with random-filled bitmap
      [  267.214752] find_next_bit: 4474 cycles, 16419 iterations
      [  267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations
      [  267.229294] find_last_bit: 4209 cycles, 16419 iterations
      [  267.279131] find_first_bit: 1032991 cycles, 16420 iterations
      [  267.286265] Start testing find_next_and_bit() with random-filled
      bitmap
      [  267.302386] find_next_and_bit: 2290 cycles, 8140 iterations
      [  267.309422] Start testing find_bit() with sparse bitmap
      [  267.316054] find_next_bit: 191 cycles, 66 iterations
      [  267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations
      [  267.329803] find_last_bit: 84 cycles, 66 iterations
      [  267.336169] find_first_bit: 4118 cycles, 66 iterations
      [  267.342627] Start testing find_next_and_bit() with sparse bitmap
      [  267.356919] find_next_and_bit: 91 cycles, 1 iterations
      
      [courbet@google.com: v6]
        Link: http://lkml.kernel.org/r/20171129095715.23430-1-courbet@google.com
      [geert@linux-m68k.org: m68k/bitops: always include <asm-generic/bitops/find.h>]
        Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
      Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
      Signed-off-by: Clement Courbet <courbet@google.com>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Yury Norov <ynorov@caviumnetworks.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
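
      A simplified user-space sketch of the "direct join" idea described in
      this commit: AND the two bitmaps one word at a time and search the
      result, instead of finding a bit in one bitmap and then looking it up
      in the other. This is an illustration only, not the kernel's
      find_next_and_bit() implementation; `next_and_bit()` and
      `BITS_PER_WORD` are hypothetical names.

      ```c
      #include <limits.h>
      #include <stddef.h>

      #define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

      /*
       * Returns the index of the first bit >= start that is set in both a and
       * b, or nbits if there is none.
       */
      static size_t next_and_bit(const unsigned long *a, const unsigned long *b,
      			   size_t nbits, size_t start)
      {
      	size_t word, bit;
      	unsigned long joined;

      	if (start >= nbits)
      		return nbits;

      	word = start / BITS_PER_WORD;
      	/* Join the first word and mask off the bits below start. */
      	joined = a[word] & b[word] & (~0UL << (start % BITS_PER_WORD));

      	/* Skip words whose join is empty. */
      	while (joined == 0) {
      		word++;
      		if (word * BITS_PER_WORD >= nbits)
      			return nbits;
      		joined = a[word] & b[word];
      	}

      	/* Locate the lowest set bit of the joined word. */
      	bit = word * BITS_PER_WORD;
      	while (!(joined & 1UL)) {
      		joined >>= 1;
      		bit++;
      	}
      	return bit < nbits ? bit : nbits;
      }
      ```

      With the 0x0000ffff and 0x00005555 patterns from the results table and
      nbits of 32, a search from bit 1 returns 2. When the second pattern is
      0x00000000 the search gives up after one word-sized AND, which may be
      why that row shows the largest speedup.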