1. 20 Oct, 2015 12 commits
    • perf bench mem: Improve user visible strings · 13b1fdce
      Ingo Molnar committed
       - fix various typos in user visible output strings
       - make the output consistent (wrt. capitalization and spelling)
       - offer the list of routines to benchmark on '-r help'.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-11-git-send-email-mingo@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      13b1fdce
    • perf bench mem: Fix 'length' vs. 'size' naming confusion · a69b4f74
      Ingo Molnar committed
      So 'perf bench mem memcpy/memset' consistently uses 'len' and 'length'
      for buffer sizes - while it's really a memory buffer size. (strings have
      length.)
      
      Rename all affected variables.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-10-git-send-email-mingo@kernel.org
      [ Update perf-bench man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a69b4f74
    • perf bench mem: Rename 'routine' to 'routine_str' · e815e327
      Ingo Molnar committed
      So bench/mem-functions.c has a 'routine' name for the routines parameter
      string, but a 'length_str' name for the length parameter string.
      
      We also have another entity named 'routine': 'struct routine'.
      
      This is inconsistent and confusing: rename 'routine' to 'routine_str'.
      
      Also fix typos in the --routine help text.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-9-git-send-email-mingo@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e815e327
    • perf bench mem: Change 'cycle' to 'cycles' · b14f2d35
      Ingo Molnar committed
      So 'perf bench mem memset/memcpy' has a CPU cycles measurement method,
      but calls it 'cycle' (singular) throughout the code, which makes it
      harder to read.
      
      Rename all related functions, variables and options to a plural 'cycles'
      nomenclature.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-8-git-send-email-mingo@kernel.org
      [ s/--cycle/--cycles/g in perf-bench man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b14f2d35
    • perf bench: List output formatting options on 'perf bench -h' · 7a46a8fd
      Ingo Molnar committed
      So 'perf bench -h' is not very helpful when printing the help line
      about the output formatting options:
      
          -f, --format <default>
                                    Specify format style
      
      There are two output format styles, 'default' and 'simple', so improve
      the help text to:
      
          -f, --format <default|simple>
                                    Specify the output formatting style
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-7-git-send-email-mingo@kernel.org
      [ Removed leftovers from the mem-functions.c rename ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7a46a8fd
    • perf bench: Remove the prefaulting complication from 'perf bench mem mem*' · 6db175c7
      Ingo Molnar committed
      So 'perf bench mem memcpy/memset' has elaborate code to measure
      memcpy()/memset() performance both with freshly allocated buffers (which
      includes initial page fault overhead) and with preallocated buffers.
      
      But the thing is, the resulting bandwidth results are mostly
      meaningless, because page faults dominate so much of the cost.
      
      It might make sense to measure cache cold vs. cache hot performance, but
      the code does not do this.
      
      So remove this complication, and always prefault the ranges before using
      them.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-6-git-send-email-mingo@kernel.org
      [ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6db175c7
    • perf bench: Rename 'mem-memcpy.c' => 'mem-functions.c' · 9b2fa7f3
      Ingo Molnar committed
      So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew
      memset() functionality and now I plan to add string copy benchmarks as
      well.
      
      This makes the file name a misnomer: rename it to the more generic
      mem-functions.c name.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-5-git-send-email-mingo@kernel.org
      [ The "rename" was introducing __unused, wasn't removing the old file,
        and didn't update tools/perf/bench/Build, fix it ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9b2fa7f3
    • perf bench: Eliminate unused argument from bench_mem_common() · 2946f59a
      Ingo Molnar committed
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-4-git-send-email-mingo@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2946f59a
    • perf bench: Default to all routines in 'perf bench mem' · 27619741
      Ingo Molnar committed
      So few people know that the --routine option to 'perf bench memcpy/memset'
      exists, and would not know that it's capable of testing the kernel's
      memcpy/memset implementations.
      
      Furthermore, 'perf bench mem all' will not run all routines:
      
      	vega:~> perf bench mem all
      	# Running mem/memcpy benchmark...
      	Routine default (Default memcpy() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	     894.454383 MB/Sec
      	       3.844734 GB/Sec (with prefault)
      
      	# Running mem/memset benchmark...
      	Routine default (Default memset() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       1.220703 GB/Sec
      	       9.042245 GB/Sec (with prefault)
      
      Because misleadingly the 'all' refers to 'all sub-benchmarks', not 'all
      sub-benchmarks and routines'.
      
      Fix all this by making the memcpy/memset routine default to 'all',
      which results in all the benchmarks being run:
      
      	triton:~> perf bench mem all
      	# Running mem/memcpy benchmark...
      	Routine default (Default memcpy() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       1.448906 GB/Sec
      	       4.957170 GB/Sec (with prefault)
      	Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.614153 GB/Sec
      	       4.379204 GB/Sec (with prefault)
      	Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.570036 GB/Sec
      	       4.264465 GB/Sec (with prefault)
      	Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.788576 GB/Sec
      	       6.554111 GB/Sec (with prefault)
      
      	# Running mem/memset benchmark...
      	Routine default (Default memset() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       2.082223 GB/Sec
      	       9.126752 GB/Sec (with prefault)
      	Routine x86-64-unrolled (unrolled memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       5.710892 GB/Sec
      	       8.346688 GB/Sec (with prefault)
      	Routine x86-64-stosq (movsq-based memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       9.765625 GB/Sec
      	      12.520032 GB/Sec (with prefault)
      	Routine x86-64-stosb (movsb-based memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       9.668936 GB/Sec
      	      12.682630 GB/Sec (with prefault)
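
The new default can be illustrated with a small Python sketch. This is not the perf implementation (which is C code in bench/mem-functions.c); the ROUTINES table and the select_routines() helper are hypothetical names used only to show how a routine string that defaults to 'all' expands to every routine of a sub-benchmark. The routine names are taken from the output above.

```python
# Hypothetical routine table; the names mirror the benchmark output above.
ROUTINES = {
    "memcpy": ["default", "x86-64-unrolled", "x86-64-movsq", "x86-64-movsb"],
    "memset": ["default", "x86-64-unrolled", "x86-64-stosq", "x86-64-stosb"],
}


def select_routines(bench, routine_str="all"):
    """Return the list of routines to run for one sub-benchmark.

    'all' (the default) expands to every known routine; any other
    value must name exactly one existing routine.
    """
    available = ROUTINES[bench]
    if routine_str == "all":
        return list(available)
    if routine_str not in available:
        raise ValueError(f"Unknown routine: {routine_str}")
    return [routine_str]


if __name__ == "__main__":
    # 'perf bench mem all' style run: every sub-benchmark, every routine.
    for bench in ROUTINES:
        for routine in select_routines(bench):
            print(f"# Running mem/{bench} benchmark, routine {routine}")
```

With this shape, asking for a single routine remains possible, while omitting the option no longer silently limits the run to the glibc default.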
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-3-git-send-email-mingo@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      27619741
    • perf bench: Improve the 'perf bench mem memcpy' code readability · 13839ec4
      Ingo Molnar committed
       - improve the readability of initializations
       - fix unnecessary double negations
       - fix ugly line breaks
       - fix other small details
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-2-git-send-email-mingo@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      13839ec4
    • perf test: Suppress libtraceevent warnings · 2690c730
      Namhyung Kim committed
      Currently libtraceevent emits warnings on unsupported event formats.
      However, it would be better to see them only when the -v option is given.
      To do that, perf needs to override the warning() function used in
      libtraceevent.  Thus add set_warning_routine(), analogous to
      set_die_routine(), and check the verbose flag in our warning routine.
      
      Before:
        # perf test 5
         5: parse events tests                                       :
          Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
          Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
          ...
         Ok
      
      After:
        # perf test 5
         5: parse events tests                                       : Ok
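
The override mechanism described above can be sketched in Python. The real change is in C and this is only an illustration under assumptions: set_warning_routine() and warning() are named after the functions in the commit message, while default_warning(), quiet_warning() and the module-level verbose variable are hypothetical stand-ins.

```python
import sys

verbose = 0  # stand-in for the tool's verbose level (set by -v)


def default_warning(fmt, *args):
    """Library default: always print the warning to stderr."""
    print("  Warning: " + fmt % args, file=sys.stderr)


# The currently installed warning routine; the library calls through it.
_warning_routine = default_warning


def set_warning_routine(routine):
    """Let the tool replace the routine the library uses for warnings."""
    global _warning_routine
    _warning_routine = routine


def warning(fmt, *args):
    """What library code calls when it hits an unsupported event format."""
    _warning_routine(fmt, *args)


def quiet_warning(fmt, *args):
    """Tool-side routine: forward the warning only in verbose mode."""
    if verbose:
        default_warning(fmt, *args)


if __name__ == "__main__":
    set_warning_routine(quiet_warning)
    # Prints nothing unless verbose is non-zero.
    warning("[%s] bad op token %s", "kvmmmu:kvm_mmu_get_page", "{")
```

The library keeps a single indirection point, so the tool decides the policy (silent by default, noisy with -v) without the library knowing about the verbose flag.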
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1445268229-1601-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2690c730
    • perf test: Silence tracepoint event failures · 87191383
      Namhyung Kim committed
      Currently, when 'perf test' is run by a normal user, it'll fail to
      access tracepoint events.  The output becomes somewhat messy because it
      tries to be nice with long error messages and hints.
      
      IMHO this is not needed for 'perf test' by default and AFAIK 'perf test'
      uses pr_debug() rather than pr_err() for such messages so that one can
      use -v option to see further details on failed testcases if needed.
      
      Before:
        $ perf test
         1: vmlinux symtab matches kallsyms                          : FAILED!
         2: detect openat syscall event                              :Error:
        No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
        FAILED!
         3: detect openat syscall event on all cpus                  :Error:
        No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
        FAILED!
         ...
      
      After:
        $ perf test
         1: vmlinux symtab matches kallsyms                          : FAILED!
         2: detect openat syscall event                              : FAILED!
         3: detect openat syscall event on all cpus                  : FAILED!
         ...
      
        $ perf test -v 2
         2: detect openat syscall event                              :
        --- start ---
        test child forked, pid 30575
        Error:	    No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:  Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
      
        test child finished with -1
        ---- end ----
        detect openat syscall event: FAILED!
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1445268229-1601-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      87191383
  2. 14 Oct, 2015 1 commit
    • Merge tag 'perf-core-for-mingo' of... · e9363dee
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Use the alternative with the most descriptive filename containing
          a vmlinux file for a given build-id, providing a better title line
          for tools such as 'annotate'. (Arnaldo Carvalho de Melo)
      
        - Remove help messages about previous right and left arrow keybindings, that
          were repurposed for horizontal scrolling. (Arnaldo Carvalho de Melo)
      
        - Inform how to reset the symbol filter in the hists browser. (top & report)
          (Arnaldo Carvalho de Melo)
      
        - Add 'm' key for context menu display in the hists browser, that became
          inaccessible with the repurposing of the right arrow key for horizontal
          scrolling. (Namhyung Kim)
      
        - Use debug_frame for callchains if eh_frame is unusable. (Rabin Vicent)
      
      Build fixes:
      
        - Fix strict-aliasing breakage with gcc 4.4 in the READ_ONCE/WRITE_ONCE code
          adopted from the kernel tree, that builds with -fno-strict-aliasing while
          tools/perf/ uses -Wstrict-aliasing=3. (Jiri Olsa)
      
        - Fix unw_word_t pointer casts in code using libunwind for callchains,
          fixing the build in at least 32-bit MIPS systems. (Rabin Vicent)
      
        - Work around cross compile build problems related to fixdep. (Jiri Olsa)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e9363dee
  3. 13 Oct, 2015 8 commits
  4. 08 Oct, 2015 4 commits
    • Merge tag 'perf-core-for-mingo' of... · 0e537fef
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Adding a field via 'perf report -F' that already is enabled makes
          the tool get stuck in a loop, fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Support PERF_RECORD_SWITCH in the python binding. (Arnaldo Carvalho de Melo)
      
        - Fix handling read() result using a signed variable, found with Coccinelle.
          (Andrzej Hajda)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0e537fef
    • d3df65c1
    • perf python: Support the PERF_RECORD_SWITCH event · ae938802
      Arnaldo Carvalho de Melo committed
      To test it check tools/perf/python/twatch.py, after following the
      instructions there to enable context_switch, output looks like:
      
        [root@zoo linux]# tools/perf/python/twatch.py
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
        cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 0 }
        cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 1 }
        cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 0 }
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 1 }
        cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 1 }
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
        ^CTraceback (most recent call last):
          File "tools/perf/python/twatch.py", line 67, in <module>
            main(context_switch = 1, thread = 31463)
          File "tools/perf/python/twatch.py", line 40, in main
            evlist.poll(timeout = -1)
        KeyboardInterrupt
        [root@zoo linux]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Guy Streeter <streeter@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-1ukistmpamc5z717k80ctcp2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ae938802
    • Merge tag 'perf-urgent-for-mingo' of... · 00e6fa5f
      Ingo Molnar committed
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fix from Arnaldo Carvalho de Melo:
      
        - Fix build break on (at least) powerpc due to sample_reg_masks, not being
          available for linking. (Sukadev Bhattiprolu)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      00e6fa5f
  5. 07 Oct, 2015 8 commits
  6. 06 Oct, 2015 7 commits
    • perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore · 712df65c
      Taku Izumi committed
      In a multi-segment system, uncore devices may belong to buses whose segment
      number is other than 0:
      
        ....
        0000:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:7f:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:bf:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
      
      In that case, the relation between bus number and physical id may be broken
      because "uncore_pcibus_to_physid" doesn't take the PCI segment into account.
      For example, buses 0000:ff and 0001:ff use the same entry of the
      "uncore_pcibus_to_physid" array.
      
      This patch fixes this problem by introducing the segment-aware pci2phy_map instead.
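
The collision can be shown with a short Python sketch. It is not the kernel code: the device list reuses the segment and bus numbers from the listing above, but the physical ids and both helper functions are invented for illustration.

```python
# (segment, bus, physical id); the physical ids are made up for this example.
DEVICES = [
    (0x0000, 0xFF, 0),
    (0x0001, 0x7F, 1),
    (0x0001, 0xBF, 2),
    (0x0001, 0xFF, 3),
]


def build_bus_only_map(devices):
    """Old scheme: index by bus number only, ignoring the PCI segment."""
    bus_to_physid = {}
    for _segment, bus, physid in devices:
        # A later segment with the same bus number overwrites the entry.
        bus_to_physid[bus] = physid
    return bus_to_physid


def build_segment_aware_map(devices):
    """Segment-aware scheme: one bus -> physid table per PCI segment."""
    segments = {}
    for segment, bus, physid in devices:
        segments.setdefault(segment, {})[bus] = physid
    return segments


if __name__ == "__main__":
    flat = build_bus_only_map(DEVICES)
    aware = build_segment_aware_map(DEVICES)
    print("bus-only lookup of 0xff:", flat[0xFF])
    print("segment 0, bus 0xff:", aware[0x0000][0xFF])
    print("segment 1, bus 0xff:", aware[0x0001][0xFF])
```

In the bus-only table, buses 0000:ff and 0001:ff share one slot, so one of the two packages gets the wrong physical id; keying on the segment first keeps them apart.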
      Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1443096621-4119-1-git-send-email-izumi.taku@jp.fujitsu.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      712df65c
    • perf/x86: Add Intel cstate PMUs support · 7ce1346a
      Kan Liang committed
      This patch adds new PMUs to support cstate related free running
      (read-only) counters. These counters may be used simultaneously by other
      tools, such as turbostat. However, it still makes sense to implement them
      in perf, because we can conveniently collect them together with other
      events, and use them from tools without special MSR access code.
      
      These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
      According to counters' scope and category, two PMUs are registered with
      the perf_event core subsystem.
      
       - 'cstate_core': The counter is available for each physical core. The
                        counters include CORE_C*_RESIDENCY.
      
       - 'cstate_pkg':  The counter is available for each physical package. The
                        counters include PKG_C*_RESIDENCY.
      
      The events are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
        /sys/devices/cstate_core/events/c*-residency
        /sys/devices/cstate_pkg/events/c*-residency
      
      These events only support system-wide mode counting.
      The /sys/devices/cstate_*/cpumask file can be used by tools to figure
      out which CPUs to monitor by default.
      
      The PMU type (attr->type) is dynamically allocated and is available from
      /sys/devices/core_misc/type and /sys/devices/cstate_*/type.
      
      Sampling is not supported.
      
      Here is an example.
      
       - To calculate the fraction of time when the core is running in C6 state
         CORE_C6_time% = CORE_C6_RESIDENCY / TSC
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5
      
         11838820015,,cstate_core/c6-residency/,5175919658,100.00
         11877130740,,msr/tsc/,5175922010,100.00
      
       For sleep, 99.7% of time we ran in C6 state.
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop
      
         1253316,,cstate_core/c6-residency/,4360969154,100.00
         10012635248,,msr/tsc/,4360972366,100.00
      
       For busyloop, 0.01% of time we ran in C6 state.
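
The residency arithmetic above can be reproduced with a small Python sketch that parses the 'perf stat -x,' lines quoted in this message. The helper names are hypothetical; the field layout assumed here (count first, event name third) is read off the sample output above rather than taken from perf documentation.

```python
import csv
import io

# Sample lines copied from the commit message above.
SLEEP_OUTPUT = """\
   11838820015,,cstate_core/c6-residency/,5175919658,100.00
   11877130740,,msr/tsc/,5175922010,100.00
"""

BUSYLOOP_OUTPUT = """\
   1253316,,cstate_core/c6-residency/,4360969154,100.00
   10012635248,,msr/tsc/,4360972366,100.00
"""


def parse_counts(text):
    """Map event name -> counter value from comma-separated perf stat lines."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        fields = [field.strip() for field in row]
        # Skip blank lines and anything that does not start with a number.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        counts[fields[2]] = int(fields[0])
    return counts


def c6_fraction(counts):
    """CORE_C6_time% = CORE_C6_RESIDENCY / TSC, returned as a fraction."""
    return counts["cstate_core/c6-residency/"] / counts["msr/tsc/"]


if __name__ == "__main__":
    for name, text in (("sleep", SLEEP_OUTPUT), ("busyloop", BUSYLOOP_OUTPUT)):
        percent = 100 * c6_fraction(parse_counts(text))
        print(f"{name}: {percent:.2f}% of time in C6")
```

Dividing by the TSC count taken in the same run is what makes the two free-running counters comparable, since both advance over the same wall-clock interval.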
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7ce1346a
    • Merge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · f6702681
      Linus Torvalds committed
      Pull xen bug fixes from David Vrabel:
      
       - Fix VM save performance regression with x86 PV guests
      
       - Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)
      
       - Other minor fixes.
      
      * tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen/p2m: hint at the last populated P2M entry
        x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
        x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
        xen/x86: Don't try to write syscall-related MSRs for PV guests
        xen: use correct type for HYPERVISOR_memory_op()
      f6702681
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3ec20e2e
      Linus Torvalds committed
      Pull s390 fixes from Martin Schwidefsky:
       "Three bug fixes and an update to the default configuration"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/defconfig: set SCSI_DH=y
        s390/vtime: correct scaled cputime of partially idle CPUs
        s390/boot/decompression: disable floating point in decompressor
        s390/numa: use correct type for node_to_cpumask_map
      3ec20e2e
    • Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · 3c68319b
      Linus Torvalds committed
      Pull CIFS fixes from Steve French:
       "Two fixes for problems pointed out by automated tools.
      
        Thanks PaX/grsecurity team and Dan Carpenter (and the Smatch tool)"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        [CIFS] Update cifs version number
        [SMB3] Do not fall back to SMBWriteX in set_file_size error cases
        [SMB3] Missing null tcon check
      3c68319b
    • x86/xen/p2m: hint at the last populated P2M entry · 98dd166e
      David Vrabel committed
      With commit 633d6f17 (x86/xen: prepare
      p2m list for memory hotplug) the P2M may be sized to accommodate a much
      larger amount of memory than the domain currently has.
      
      When saving a domain, the toolstack must scan all the P2M looking for
      populated pages.  This results in a performance regression due to the
      unnecessary scanning.
      
      Instead of reporting (via shared_info) the maximum possible size of
      the P2M, hint at the last PFN which might be populated.  This hint is
      increased as new leaves are added to the P2M (in the expectation that
      they will be used for populated entries).
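
The effect of the hint can be sketched in Python. This is a toy model, not the Xen or kernel code: the P2M class, its scan_limit field and the leaf size of 512 entries are assumptions chosen only to show why scanning up to a hint is cheaper than scanning the maximum possible size.

```python
class P2M:
    """Toy physical-to-machine table sized for future memory hotplug."""

    def __init__(self, max_pfns, leaf_size=512):
        self.entries = [None] * max_pfns  # None means "not populated"
        self.leaf_size = leaf_size        # assumed number of entries per leaf
        self.scan_limit = 0               # hint: PFNs below this may be populated

    def set_entry(self, pfn, mfn):
        # Adding to a leaf raises the hint to the end of that leaf, on the
        # expectation that the rest of the leaf will be populated too.
        leaf_end = (pfn // self.leaf_size + 1) * self.leaf_size
        self.scan_limit = max(self.scan_limit, min(leaf_end, len(self.entries)))
        self.entries[pfn] = mfn

    def populated(self, use_hint=True):
        """Return (populated PFNs, number of entries scanned)."""
        limit = self.scan_limit if use_hint else len(self.entries)
        found = [pfn for pfn in range(limit) if self.entries[pfn] is not None]
        return found, limit


if __name__ == "__main__":
    p2m = P2M(max_pfns=1 << 16)
    p2m.set_entry(3, 100)
    p2m.set_entry(700, 200)
    print("with hint:", p2m.populated(use_hint=True))
    print("full scan:", p2m.populated(use_hint=False))
```

Both scans find the same populated pages, but the hinted scan touches only the leaves that were ever added, which is the saving the save path benefits from.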
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      98dd166e
    • Merge tag 'perf-core-for-mingo' of... · 1c748dc2
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Switch the default callchain output mode to 'graph,0.5,caller', to make it
          look like the default for other tools, reducing the learning curve for
          people used to 'caller' based viewing. (Arnaldo Carvalho de Melo)
      
        - Implement column based horizontal scrolling in the hists browser (top, report),
          making it possible to use the TUI for things like 'perf mem report' where
          there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo)
      
        - Support sorting by symbol_iaddr with perf.data files produced by
          'perf mem record'. (Don Zickus)
      
        - Display DATA_SRC sample type bit, i.e. when running 'perf evlist -v' the
          "DATA_SRC" wasn't appearing when set, fix it to look like: (Jiri Olsa)
      
            cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC
      
        - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.:
      
           $ perf record -e cycles:P usleep 1
      
          Is now similar to:
      
           $ perf record usleep 1
      
          Useful, for instance, when specifying multiple events. (Jiri Olsa)
      
        - Make 'perf -v' and 'perf -h' work. (Jiri Olsa)
      
        - Fail properly when pattern matching fails to find a tracepoint, i.e.
          '-e non:existent' was being correctly handled, with a proper error message
          about that not being a valid event, but '-e non:existent*' wasn't,
          fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Separate arch specific entries in 'perf test' and add an 'Intel CQM' one
          to be run on x86 only. (Matt Fleming)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1c748dc2