1. 10 Mar, 2016 7 commits
    • powerpc/perf: Fix misleading comment in pmao_restore_workaround() · 58bffb5b
      Madhavan Srinivasan committed
      The current comment in pmao_restore_workaround() regarding
      hard_irq_disable() is wrong. It should say to hard *disable* interrupts
      instead of *enable*. Fix it.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf/24x7: Eliminate domain suffix in event names · 8f69dc70
      Sukadev Bhattiprolu committed
      The Physical Core events of the 24x7 PMU can be monitored across various
      domains (physical core, vcpu home core, vcpu home node etc). For each of
      these core events, we currently create multiple events in sysfs, one for
      each domain the event can be monitored in. These events are distinguished
      by their suffixes like __PHYS_CORE, __VCPU_HOME_CORE etc.
      
      Rather than creating multiple such entries, we could make the 'domain'
      index a required parameter and let the user specify a value for it (like
      they currently specify the core index).
      
      	$ cat /sys/bus/event_source/devices/hv_24x7/events/HPM_CCYC
      	domain=?,offset=0x98,core=?,lpar=0x0
      
      	$ perf stat -C 0 -e hv_24x7/HPM_CCYC,domain=2,core=1/ true
      
      (the 'domain=?' and 'core=?' in sysfs tell perf tool to enforce them as
      required parameters).
      
      This simplifies the interface and allows users to identify events by the
      name specified in the catalog (users can determine the domain index by
      referring to '/sys/bus/event_source/devices/hv_24x7/interface/domains').
      
      Eliminating the event suffix eliminates several functions and simplifies
      code.
      
      Note that Physical Chip events can only be monitored in the chip domain
      so those events have the domain set to 1 (rather than =?) and users don't
      need to specify the domain index for the Chip events.
      
      	$ cat /sys/bus/event_source/devices/hv_24x7/events/PM_XLINK_CYCLES
      	domain=1,offset=0x230,chip=?,lpar=0x0
      
      	$ perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES,chip=1/ true
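
As a rough illustration of the '=?' convention described above, the following Python sketch parses the two sysfs event strings quoted in this message and lists which parameters the user still has to supply. The helper names `parse_event_spec` and `required_params` are made up for this example and are not part of the perf tool or the kernel.

```python
def parse_event_spec(spec):
    """Parse a sysfs event string such as 'domain=?,offset=0x98,core=?,lpar=0x0'.

    Fields written as '=?' map to None, meaning the user must supply a value.
    """
    fields = {}
    for term in spec.strip().split(","):
        name, _, value = term.partition("=")
        # int(value, 0) accepts both decimal ("1") and hex ("0x98") notation.
        fields[name] = None if value == "?" else int(value, 0)
    return fields


def required_params(spec):
    """Return the parameter names that the sysfs entry marks as required."""
    return [name for name, value in parse_event_spec(spec).items() if value is None]


core_event = "domain=?,offset=0x98,core=?,lpar=0x0"   # HPM_CCYC
chip_event = "domain=1,offset=0x230,chip=?,lpar=0x0"  # PM_XLINK_CYCLES

print(required_params(core_event))  # core events need both domain and core
print(required_params(chip_event))  # chip events only need chip
```

With the core event both `domain` and `core` come back as required, while the chip event, whose domain is fixed to 1, only requires `chip`.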
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf/hv-24x7: Display domain indices in sysfs · d34171e8
      Sukadev Bhattiprolu committed
      To help users determine domains, display the domain indices used by the
      kernel in sysfs.
      
      	$ cat /sys/bus/event_source/devices/hv_24x7/interface/domains
      	1: Physical Chip
      	2: Physical Core
      	3: VCPU Home Core
      	4: VCPU Home Chip
      	5: VCPU Home Node
      	6: VCPU Remote Node
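
A small Python sketch of how a script might turn that listing into a lookup table, so a domain name can be mapped to the index that is passed as `domain=` on the perf command line. The listing is embedded as a string here so the example runs anywhere; `parse_domains` and `domain_index` are hypothetical helper names.

```python
DOMAINS_TEXT = """\
1: Physical Chip
2: Physical Core
3: VCPU Home Core
4: VCPU Home Chip
5: VCPU Home Node
6: VCPU Remote Node
"""


def parse_domains(text):
    """Map each domain index to its name, one 'index: name' pair per line."""
    domains = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        index, _, name = line.partition(":")
        domains[int(index)] = name.strip()
    return domains


def domain_index(domains, name):
    """Return the index for a domain name (case-insensitive match)."""
    for index, label in domains.items():
        if label.lower() == name.lower():
            return index
    raise KeyError(name)


domains = parse_domains(DOMAINS_TEXT)
print(domain_index(domains, "Physical Core"))
```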
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf/hv-24x7: Display change in counter values · 2b206ee6
      Sukadev Bhattiprolu committed
      For 24x7 counters, perf displays the raw value of the 24x7 counter, which
      is a monotonically increasing value.
      
      	perf stat -C 0 -e \
      		'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \
      		sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
           9,105,403,170      hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/
      
             0.000425751 seconds time elapsed
      
      In the typical usage of 'perf stat' this counter value is not as useful
      as the _change_ in the counter value over the duration of the application.
      
      Have h_24x7_event_init() set the event's prev_count to the raw value of
      the 24x7 counter at the time of initialization. When the application
      terminates, hv_24x7_event_read() will compute the change in value and
      report to the perf tool. Similarly, for the transaction interface, clear
      the event count to 0 at the beginning of the transaction.
      
      	perf stat -C 0 -e \
      		'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \
      		sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
                 245,758      hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/
      
             1.006366383 seconds time elapsed
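
The prev_count bookkeeping described above can be modelled in a few lines of Python. This is only a sketch of the idea, not the kernel code: `RawCounter` and `Event` are invented stand-ins, and the starting value and increment are borrowed from the two sample runs purely as illustrative numbers.

```python
class RawCounter:
    """Stand-in for a monotonically increasing 24x7 hardware counter."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value


class Event:
    """Tracks the change in a raw counter since the event was initialized."""

    def __init__(self, counter):
        self.counter = counter
        self.prev_count = 0
        self.count = 0

    def init(self):
        # Snapshot the raw value so later reads report only the change.
        self.prev_count = self.counter.read()

    def read(self):
        now = self.counter.read()
        self.count += now - self.prev_count
        self.prev_count = now
        return self.count


counter = RawCounter(9_105_403_170)  # raw value at event initialization
event = Event(counter)
event.init()
counter.value += 245_758             # the counter advances while the workload runs
print(event.read())                  # prints 245758, the change rather than the raw value
```

Without the snapshot in `init()`, the first read would report the full raw value, which is the behaviour shown in the first sample run.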
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf/hv-24x7: Fix usage with chip events. · e5a5886d
      Sukadev Bhattiprolu committed
      24x7 counters can belong to different domains (core, chip, virtual CPU
      etc). For events in the 'chip' domain, sysfs entry currently looks like:
      
      	$ cd /sys/bus/event_source/devices/hv_24x7/events
      	$ cat PM_XLINK_CYCLES__PHYS_CHIP
      	domain=0x1,offset=0x230,core=?,lpar=0x0
      
      where the required parameter, 'core=?' is specified with perf as:
      
      	perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,core=1/ \
      		/bin/true
      
      This is inconsistent in that 'core' is a required parameter for a chip
      event.  Instead, have the sysfs entry display 'chip=?' for chip
      events:
      
      	$ cd /sys/bus/event_source/devices/hv_24x7/events
      	$ cat PM_XLINK_CYCLES__PHYS_CHIP
      	domain=0x1,offset=0x230,chip=?,lpar=0x0
      
      We also need to add a 'chip' entry in the sysfs format directory:
      
      	$ ls /sys/bus/event_source/devices/hv_24x7/format
      	chip  core  domain  lpar  offset  vcpu
      	^^^^
      	(new)
      
      so the perf tool can automatically check usage and format the chip
      parameter correctly:
      
      	$ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/ \
      		/bin/true
      	Required parameter 'chip' not specified
      	invalid or unsupported event: 'hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/'
      
      	$ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/ \
      		/bin/true
      	hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/: 0 6628908 6628908
      
      	 Performance counter stats for 'CPU(s) 0':
      
      	         0      hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/
      
      	    0.006606970 seconds time elapsed
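
To make the usage check concrete, here is a hedged Python sketch of the kind of validation the perf tool performs against the sysfs entry. It is not the perf implementation; `check_required` is a hypothetical helper, and only the two sysfs strings and the error text are taken from this message.

```python
def check_required(sysfs_spec, user_terms):
    """Return the '=?' parameters in sysfs_spec that user_terms does not supply."""
    required = [term.split("=")[0]
                for term in sysfs_spec.split(",")
                if term.endswith("=?")]
    return [name for name in required if name not in user_terms]


old_spec = "domain=0x1,offset=0x230,core=?,lpar=0x0"  # before this change
new_spec = "domain=0x1,offset=0x230,chip=?,lpar=0x0"  # after this change

for terms in ({}, {"chip": 1}):
    for name in check_required(new_spec, terms):
        print(f"Required parameter '{name}' not specified")
```

Against the old entry the same check would demand `core` even though the event is a chip event, which is the inconsistency this commit removes.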
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: Export Power8 generic and cache events to sysfs · e0728b50
      Sukadev Bhattiprolu committed
      Power8 supports a large number of events in each subsystem so when a
      user runs:
      
      	perf stat -e branch-instructions sleep 1
      	perf stat -e L1-dcache-loads sleep 1
      
      it is not clear as to which PMU events were monitored.
      
      Export the generic hardware and cache perf events for Power8 to sysfs,
      so users can precisely determine the PMU event monitored by the generic
      event.
      
      E.g.:
      	$ cat /sys/bus/event_source/devices/cpu/events/branch-instructions
      	event=0x10068
      
      	$ cat /sys/bus/event_source/devices/cpu/events/L1-dcache-loads
      	event=0x100ee
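
A short Python sketch of how a script could resolve a generic event name to its raw event code by reading such files. So that it runs on any machine, the example writes the two values quoted above into a temporary directory instead of reading /sys; `read_event_code` is a hypothetical helper name.

```python
import os
import tempfile


def read_event_code(events_dir, name):
    """Read '<events_dir>/<name>' and return the code from its 'event=0x...' line."""
    with open(os.path.join(events_dir, name)) as f:
        text = f.read().strip()
    key, _, value = text.partition("=")
    if key != "event":
        raise ValueError(f"unexpected sysfs content: {text!r}")
    return int(value, 16)


# Values copied from the example output in this commit message.
sample = {"branch-instructions": "event=0x10068\n",
          "L1-dcache-loads": "event=0x100ee\n"}

codes = {}
with tempfile.TemporaryDirectory() as events_dir:
    for name, content in sample.items():
        with open(os.path.join(events_dir, name), "w") as f:
            f.write(content)
    for name in sample:
        codes[name] = read_event_code(events_dir, name)

for name, code in codes.items():
    print(f"{name}: {code:#x}")
```

On a Power8 system the same function could be pointed at /sys/bus/event_source/devices/cpu/events instead of the temporary directory.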
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: Remove PME_ prefix for power7 events · d4969e24
      Sukadev Bhattiprolu committed
      We used the PME_ prefix earlier to avoid some macro/variable name
      collisions.  We have since changed the way we define/use the event
      macros so we no longer need the prefix.
      
      By dropping the prefix, we keep the event macros consistent with
      their official names.
      Reported-by: Michael Ellerman <ellerman@au1.ibm.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 09 Mar, 2016 14 commits
  3. 07 Mar, 2016 8 commits
    • powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel · 8c50b72a
      Torsten Duwe committed
      Firstly we add logic to Kconfig to allow a user to choose if they want
      mprofile-kernel. This has to be user-selectable because only some
      current toolchains support it. If we enabled it unconditionally we would
      prevent some users from building the kernel entirely.
      
      Arguably it would be nice if we could detect if mprofile-kernel was
      available, and use it then. However that would violate the principle of
      least surprise because a user having chosen options such as live
      patching would then see them quietly disabled at build time.
      
      We also make the user selectable option negative, ie. it disables when
      selected, so that allyesconfig continues to build on old toolchains.
      
      Once we've decided we do want to use mprofile-kernel, we then add a
      script which checks it actually works. That is because there are
      versions of gcc that accept the flag but don't generate correct code.
      
      Due to the way kconfig works, we can't error out when we detect a
      non-working toolchain. If we did, a user would never be able to modify
      their config and run oldconfig, because the check would block oldconfig
      from running. Instead we emit a warning and add a bogus flag to CFLAGS
      so that the build will fail.
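
The decision flow described in this message can be summarised with a small Python model. It is a sketch under assumptions, not the actual Kconfig or Makefile logic: the function name, the exact flag lists and in particular `--hypothetical-bogus-flag` are placeholders invented for the example.

```python
def ftrace_build_flags(disable_mprofile_kernel, toolchain_check_passes):
    """Return (cflags, warning) for the ftrace-related compiler flags.

    Flag names are illustrative placeholders, not the kernel's real values.
    """
    if disable_mprofile_kernel:
        # The user-visible option is negative: selecting it turns the feature
        # off, so allyesconfig keeps building on old toolchains.
        return ["-pg"], None
    if toolchain_check_passes:
        return ["-pg", "-mprofile-kernel"], None
    # Configuration must not be blocked, so warn and poison the build instead.
    return (["-pg", "-mprofile-kernel", "--hypothetical-bogus-flag"],
            "compiler does not generate working -mprofile-kernel code")


for disabled, works in ((True, False), (False, True), (False, False)):
    flags, warning = ftrace_build_flags(disabled, works)
    print(disabled, works, flags, warning)
```

The third case is the interesting one: configuration still completes, but the resulting flags cannot compile, so the failure surfaces at build time with a warning explaining why.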
      Signed-off-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI · 15308664
      Torsten Duwe committed
      The gcc switch -mprofile-kernel defines a new ABI for calling _mcount()
      very early in the function with minimal overhead.
      
      Although mprofile-kernel has been available since GCC 3.4, there were
      bugs which were only fixed recently. Currently it is known to work in
      GCC 4.9, 5 and 6.
      
      Additionally there are two possible code sequences generated by the
      flag, the first uses mflr/std/bl and the second is optimised to omit the
      std. Currently only gcc 6 has the optimised sequence. This patch
      supports both sequences.
      
      Initial work started by Vojtech Pavlik, used with permission.
      
      Key changes:
       - rework _mcount() to work for both the old and new ABIs.
       - implement new versions of ftrace_caller() and ftrace_graph_caller()
         which deal with the new ABI.
       - updates to __ftrace_make_nop() to recognise the new mcount calling
         sequence.
       - updates to __ftrace_make_call() to recognise the nop'ed sequence.
       - implement ftrace_modify_call().
       - updates to the module loader to suppress the toc save in the module
         stub when calling mcount with the new ABI.
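
As an illustration of what recognising the two call sequences involves, the Python sketch below classifies a short run of 32-bit PowerPC instruction words. It assumes the `mflr r0` and `std r0,16(r1)` encodings and the `bl` opcode mask shown in the comments; the function names are invented, and the real checks in __ftrace_make_nop() are more involved than this.

```python
MFLR_R0 = 0x7C0802A6       # mflr r0
STD_R0_16_R1 = 0xF8010010  # std r0,16(r1)


def is_bl(insn):
    """True for a relative branch-and-link: opcode 18 with AA=0 and LK=1."""
    return (insn & 0xFC000003) == 0x48000001


def classify_mcount_sequence(insns):
    """Return which -mprofile-kernel call sequence the words start with, if any."""
    if (len(insns) >= 3 and insns[0] == MFLR_R0
            and insns[1] == STD_R0_16_R1 and is_bl(insns[2])):
        return "mflr/std/bl"
    if len(insns) >= 2 and insns[0] == MFLR_R0 and is_bl(insns[1]):
        return "mflr/bl"  # optimised form that omits the std
    return None


original = [MFLR_R0, STD_R0_16_R1, 0x48000101]  # bl to a forward target
optimised = [MFLR_R0, 0x4BFFFFF1]               # bl to a backward target
unrelated = [0x60000000, 0x60000000]            # two nops

for words in (original, optimised, unrelated):
    print(classify_mcount_sequence(words))
```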
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace · 9a7841ae
      Torsten Duwe committed
      Rather than open-coding -pg wherever we want to disable ftrace, use the
      existing $(CC_FLAGS_FTRACE) variable.
      
      This has the advantage that it will work in future when we use a
      different set of flags to enable ftrace.
      Signed-off-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/ftrace: Use generic ftrace_modify_all_code() · c96f8385
      Torsten Duwe committed
      Convert powerpc's arch_ftrace_update_code() from its own version to use
      the generic default functionality (without stop_machine -- our
      instructions are properly aligned and the replacements atomic).
      
      With this we gain error checking and the much-needed function_trace_op
      handling.
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
      Signed-off-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/module: Create a special stub for ftrace_caller() · 336a7b5d
      Michael Ellerman committed
      In order to support the new -mprofile-kernel ABI, we need to be able to
      call from the module back to ftrace_caller() (in the kernel) without
      using the module's r2. That is because the function in this module which
      is calling ftrace_caller() may not have set up r2, if it doesn't
      otherwise need it (ie. it accesses no globals).
      
      To make that work we add a new stub which is used for calling
      ftrace_caller(), which uses the kernel toc instead of the module toc.
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/module: Mark module stubs with a magic value · f17c4e01
      Michael Ellerman committed
      When a module is loaded, calls out to the kernel go via a stub which is
      generated at runtime. One of these stubs is used to call _mcount(),
      which is the default target of tracing calls generated by the compiler
      with -pg.
      
      If dynamic ftrace is enabled (which it typically is), another stub is
      used to call ftrace_caller(), which is the target of tracing calls when
      ftrace is actually active.
      
      ftrace then wants to disable the calls to _mcount() at module startup,
      and enable/disable the calls to ftrace_caller() when enabling/disabling
      tracing - all of these it does by patching the code.
      
      As part of that code patching, the ftrace code wants to confirm that the
      branch it is about to modify, is in fact a call to a module stub which
      calls _mcount() or ftrace_caller().
      
      Currently it does that by inspecting the instructions and confirming
      they are what it expects. Although that works, the code to do it is
      pretty intricate because it requires lots of knowledge about the exact
      format of the stub.
      
      We can make that process easier by marking the generated stubs with a
      magic value, and then looking for that magic value. Although this is not
      as rigorous as the current method, I believe it is sufficient in
      practice.
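
The trade-off can be sketched in Python: rather than decoding every instruction of a stub, the checker looks for a marker field and then compares the target. The `Stub` layout, the helper names, the addresses and the magic constant (chosen here as the ASCII bytes of "stub") are all assumptions for illustration and do not describe the kernel's actual stub format.

```python
from dataclasses import dataclass

STUB_MAGIC = 0x73747562  # ASCII "stub"; illustrative value


@dataclass
class Stub:
    magic: int   # marker written when the stub is generated
    target: int  # address the stub ultimately branches to


def create_stub(target):
    """Generate a stub and mark it so it can be recognised later."""
    return Stub(magic=STUB_MAGIC, target=target)


def is_stub_for(candidate, expected_target):
    """Cheap check: magic marker present and target matches."""
    return candidate.magic == STUB_MAGIC and candidate.target == expected_target


MCOUNT = 0x1000         # hypothetical address of _mcount()
FTRACE_CALLER = 0x2000  # hypothetical address of ftrace_caller()

stub = create_stub(MCOUNT)
print(is_stub_for(stub, MCOUNT))         # marker and target both match
print(is_stub_for(stub, FTRACE_CALLER))  # right marker, wrong target
print(is_stub_for(Stub(magic=0, target=MCOUNT), MCOUNT))  # not a marked stub
```

As the message notes, this is weaker than verifying the instruction sequence, since any memory that happens to contain the marker would pass the first half of the check.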
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/module: Only try to generate the ftrace_caller() stub once · 136cd345
      Michael Ellerman committed
      Currently we generate the module stub for ftrace_caller() at the bottom
      of apply_relocate_add(). However apply_relocate_add() is potentially
      called more than once per module, which means we will try to generate
      the ftrace_caller() stub multiple times.
      
      Although the current code deals with that correctly, ie. it only
      generates a stub the first time, it would be clearer to only try to
      generate the stub once.
      
      Note also on first reading it may appear that we generate a different
      stub for each section that requires relocation, but that is not the
      case. The code in stub_for_addr() that searches for an existing stub
      uses sechdrs[me->arch.stubs_section], ie. the single stub section for
      this module.
      
      A cleaner approach is to only generate the ftrace_caller() stub once,
      from module_finalize(). Although the original code didn't check to see
      if the stub was actually generated correctly, it seems prudent to add a
      check, so do that. And an additional benefit is we can clean the ifdefs
      up a little.
      
      Finally we must propagate the const'ness of some of the pointers passed
      to module_finalize(), but that is also an improvement.
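
A toy Python model of the restructuring, assuming a single per-module stub table: relocation processing may run several times and reuses existing stubs, while the ftrace_caller() stub is requested exactly once at finalize time and its creation is checked. The class, the addresses and the stub limit are invented for the example; only the function names mirror those mentioned in the message.

```python
FTRACE_CALLER = 0x9000  # hypothetical address of ftrace_caller()


class Module:
    def __init__(self, max_stubs=4):
        self.stubs = []          # the module's single stub section
        self.max_stubs = max_stubs
        self.tramp = None        # index of the ftrace_caller() stub

    def stub_for_addr(self, addr):
        """Return the stub index for addr, reusing an existing stub if present."""
        if addr in self.stubs:
            return self.stubs.index(addr)
        if len(self.stubs) >= self.max_stubs:
            return None          # no room left for a new stub
        self.stubs.append(addr)
        return len(self.stubs) - 1

    def apply_relocate_add(self, targets):
        """May be called more than once per module; no ftrace stub work here."""
        for addr in targets:
            self.stub_for_addr(addr)

    def module_finalize(self):
        """Called once per module: create the ftrace_caller() stub and check it."""
        self.tramp = self.stub_for_addr(FTRACE_CALLER)
        if self.tramp is None:
            raise RuntimeError("could not create ftrace_caller() stub")


mod = Module()
mod.apply_relocate_add([0x1000, 0x2000])
mod.apply_relocate_add([0x1000])  # second section reuses the existing stub
mod.module_finalize()
print(len(mod.stubs), mod.tramp)
```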
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Torsten Duwe <duwe@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Create a helper for getting the kernel toc value · a5cab83c
      Michael Ellerman committed
      Move the logic to work out the kernel toc pointer into a header. This is
      a good cleanup, and also means we can use it elsewhere in future.
      Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
      Reviewed-by: Torsten Duwe <duwe@suse.de>
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
  4. 03 Mar, 2016 8 commits
  5. 02 Mar, 2016 3 commits