1. 20 Dec 2018, 1 commit
  2. 16 Jul 2018, 2 commits
  3. 27 Mar 2018, 4 commits
  4. 17 Mar 2018, 1 commit
  5. 16 Mar 2018, 1 commit
  6. 12 Mar 2018, 1 commit
    • perf/core: Remove perf_event::group_entry · 8343aae6
      Peter Zijlstra authored
      Now that all the grouping is done with RB trees, we no longer need
      group_entry and can replace the whole thing with sibling_list.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Valery Cherepennikov <valery.cherepennikov@intel.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8343aae6
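      A minimal sketch of what iterating a group looks like once group_entry is gone (the list_for_each_entry() form and the group_leader/visit_event names are illustrative assumptions, not quoted from this log):

      	struct perf_event *sibling;

      	/* Siblings now hang off the leader's sibling_list via their own
      	 * sibling_list node; the separate group_entry node is redundant
      	 * after the RB-tree grouping rework. */
      	list_for_each_entry(sibling, &group_leader->sibling_list, sibling_list)
      		visit_event(sibling);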
  7. 19 Jan 2018, 2 commits
    • powerpc/64: Change soft_enabled from flag to bitmask · 01417c6c
      Madhavan Srinivasan authored
      "paca->soft_enabled" is used as a flag to mask some of interrupts.
      Currently supported flags values and their details:
      
      soft_enabled    MSR[EE]
      
      0               0       Disabled (PMI and HMI not masked)
      1               1       Enabled
      
      "paca->soft_enabled" is initialized to 1 to make the interripts as
      enabled. arch_local_irq_disable() will toggle the value when
      interrupts needs to disbled. At this point, the interrupts are not
      actually disabled, instead, interrupt vector has code to check for the
      flag and mask it when it occurs. By "mask it", it update interrupt
      paca->irq_happened and return. arch_local_irq_restore() is called to
      re-enable interrupts, which checks and replays interrupts if any
      occured.
      
      As mentioned, the current logic does not mask "performance monitoring
      interrupts"; PMIs are implemented as NMIs. But this patchset depends on
      local_irq_* for a successful local_* update, meaning all possible
      interrupts must be masked during the local_* update and replayed after
      the update.
      
      So the idea here is to reverse the "paca->soft_enabled" logic. The new
      values and details are:
      
      soft_enabled    MSR[EE]
      
      1               0       Disabled (PMI and HMI not masked)
      0               1       Enabled
      
      The reason for this change is to lay the foundation for a third mask
      value "0x2" for "soft_enabled", to add support for masking PMIs. When
      ->soft_enabled is set to "3", PMIs are masked, and when it is set to
      "1", PMIs are not masked. This patch also extends soft_enabled into an
      interrupt disable mask (see the sketch after this entry).
      
      The current flags are renamed from IRQ_[EN|DIS]ABLED to
      IRQS_ENABLED and IRQS_DISABLED.
      
      The patch also fixes the ptrace call so that the softe value seen by
      the user is forced to always be 1. The reason is that, even though
      userspace has no business knowing about softe, it is part of pt_regs.
      Likewise in the signal context.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      01417c6c
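      A minimal sketch of the reversed encoding and the planned PMI mask bit described above (the values follow the tables and the "0x2"/"3" discussion in this entry; the extra macro names are assumptions, not quoted from this log):

      	/* soft_enabled as an interrupt-disable bitmask, new encoding: */
      	#define IRQS_ENABLED		0	/* nothing masked                 */
      	#define IRQS_DISABLED		1	/* "normal" interrupts masked     */
      	#define IRQS_PMI_DISABLED	2	/* planned third bit: PMIs masked */
      	#define IRQS_ALL_DISABLED	(IRQS_DISABLED | IRQS_PMI_DISABLED)	/* == 3 */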
    • powerpc/64: Add #defines for paca->soft_enabled flags · c2e480ba
      Madhavan Srinivasan authored
      Two #defines, IRQS_ENABLED and IRQS_DISABLED, are added to be used
      when updating paca->soft_enabled. Replace the hardcoded values used
      when updating paca->soft_enabled with these #defines. No logic change.
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c2e480ba
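      A minimal sketch of the two definitions this entry describes (names as given in the entry text; the values follow the original encoding where 1 = enabled and 0 = disabled, and the header location is an assumption):

      	/* Assumed to live alongside the other soft-mask bits in
      	 * arch/powerpc/include/asm/hw_irq.h. */
      	#define IRQS_ENABLED	1	/* paca->soft_enabled == 1: interrupts enabled       */
      	#define IRQS_DISABLED	0	/* paca->soft_enabled == 0: interrupts soft-disabled */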
  8. 13 Dec 2017, 1 commit
    • powerpc/perf: Dereference BHRB entries safely · f41d84dd
      Ravi Bangoria authored
      It's theoretically possible that branch instructions recorded in
      BHRB (Branch History Rolling Buffer) entries have already been
      unmapped by the time the kernel processes them. Hence, trying to
      dereference such a memory location will result in a crash, e.g.:
      
          Unable to handle kernel paging request for data at address 0xd000000019c41764
          Faulting instruction address: 0xc000000000084a14
          NIP [c000000000084a14] branch_target+0x4/0x70
          LR [c0000000000eb828] record_and_restart+0x568/0x5c0
          Call Trace:
          [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable)
          [c0000000000ec378] perf_event_interrupt+0x298/0x460
          [c000000000027964] performance_monitor_exception+0x54/0x70
          [c000000000009ba4] performance_monitor_common+0x114/0x120
      
      Fix it by dereferencing the addresses safely.
      
      Fixes: 69123184 ("powerpc/perf: Fix setting of "to" addresses for BHRB")
      Cc: stable@vger.kernel.org # v3.10+
      Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      [mpe: Use probe_kernel_read() which is clearer, tweak change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f41d84dd
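      A minimal sketch of the safe-dereference pattern the fix describes (probe_kernel_read() is named in the [mpe:] note above; the surrounding variable names are illustrative):

      	unsigned int instr;

      	/*
      	 * The BHRB address may point at an instruction that has since been
      	 * unmapped, so copy it out with a fault-tolerant accessor instead
      	 * of dereferencing the pointer directly.
      	 */
      	if (probe_kernel_read(&instr, (void *)addr, sizeof(instr)))
      		return 0;	/* treat an unreadable entry as having no target */

      	/* ... otherwise decode 'instr' as before ... */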
  9. 04 Dec 2017, 1 commit
    • powerpc/perf: Fix oops when grouping different pmu events · 5aa04b3e
      Ravi Bangoria authored
      When a user tries to group an imc (In-Memory Collections) event with
      a normal event, the kernel (sometimes) crashes with the following log:
      
          Faulting instruction address: 0x00000000
          [link register   ] c00000000010ce88 power_check_constraints+0x128/0x980
          ...
          c00000000010e238 power_pmu_event_init+0x268/0x6f0
          c0000000002dc60c perf_try_init_event+0xdc/0x1a0
          c0000000002dce88 perf_event_alloc+0x7b8/0xac0
          c0000000002e92e0 SyS_perf_event_open+0x530/0xda0
          c00000000000b004 system_call+0x38/0xe0
      
      The 'event_base' field of 'struct hw_perf_event' is used as flags for
      normal hw events and as a memory address for imc events. While
      grouping these two types of events, collect_events() tries to
      interpret the imc 'event_base' as a flag, which causes a corruption
      resulting in a crash.
      
      Fix it by considering only those events which belong to
      'perf_hw_context' in collect_events(), as sketched after this entry.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5aa04b3e
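      A minimal sketch of the collect_events() filter described above (the task_ctx_nr comparison is an assumed way of expressing "belongs to perf_hw_context"; the field names are illustrative, not quoted from this log):

      	/*
      	 * Only collect siblings that belong to the hw PMU context. Events
      	 * from other PMUs (e.g. imc) reuse event_base as a memory address,
      	 * not as flags, so interpreting it here would corrupt the state.
      	 */
      	if (event->pmu->task_ctx_nr == perf_hw_context &&
      	    event->state != PERF_EVENT_STATE_OFF) {
      		ctrs[n] = event;
      		flags[n] = event->hw.event_base;
      		events[n++] = event->hw.config;
      	}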
  10. 20 Sep 2017, 1 commit
  11. 29 Aug 2017, 1 commit
    • perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR · fc7ce9c7
      Kan Liang authored
      To understand how a workload maps to memory channels and hardware
      behavior, it's very important to collect address maps with physical
      addresses. For example, 3D XPoint accesses can only be identified by
      filtering on the physical address.
      
      Add a new sample type for physical address.
      
      perf already has a facility to collect the data virtual address. This
      patch introduces a function to convert a virtual address to a physical
      address. The function is quite generic and can be extended to any
      architecture, as long as a virtual address is provided.
      
       - For kernel direct mapping addresses, virt_to_phys is used to convert
         the virtual address to a physical address.
      
       - For user virtual addresses, __get_user_pages_fast is used to walk
         the page tables and obtain the user physical address.
      
       - This does not work for vmalloc addresses right now. These are not
         resolved, but code to do that could be added.
      
      The new sample type requires collecting the virtual address. The
      virtual address will not be output unless SAMPLE_ADDR is applied.
      
      For security, the physical address can only be exposed to root or a
      privileged user.
      Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: mpe@ellerman.id.au
      Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fc7ce9c7
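      A minimal sketch of the conversion helper described above (the name perf_virt_to_phys() and everything beyond the two bullet points are assumptions, not quoted from this log):

      	static u64 perf_virt_to_phys(u64 virt)
      	{
      		u64 phys_addr = 0;
      		struct page *p = NULL;

      		if (!virt)
      			return 0;

      		if (virt >= TASK_SIZE) {
      			/* Kernel direct-mapped address: virt_to_phys() is enough.
      			 * vmalloc addresses are not resolved (left as 0). */
      			if (virt_addr_valid((void *)(uintptr_t)virt))
      				phys_addr = (u64)virt_to_phys((void *)(uintptr_t)virt);
      		} else {
      			/* User address: walk the page tables with the IRQ-safe
      			 * fast-GUP variant; on failure report 0. */
      			if (current->mm &&
      			    __get_user_pages_fast(virt, 1, 0, &p) == 1)
      				phys_addr = page_to_phys(p) + virt % PAGE_SIZE;
      			if (p)
      				put_page(p);
      		}

      		return phys_addr;
      	}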
  12. 19 Apr 2017, 2 commits
  13. 09 Mar 2017, 1 commit
  14. 17 Feb 2017, 2 commits
  15. 18 Jan 2017, 1 commit
  16. 25 Dec 2016, 1 commit
  17. 13 Sep 2016, 1 commit
  18. 14 Jul 2016, 1 commit
  19. 14 Jun 2016, 1 commit
  20. 10 Mar 2016, 1 commit
  21. 13 Sep 2015, 2 commits
    • perf/core: Drop PERF_EVENT_TXN · 8f3e5684
      Sukadev Bhattiprolu authored
      We currently use the PERF_EVENT_TXN flag to determine if we are in the
      middle of a transaction. If we are in a transaction, we defer the
      schedulability checks from the pmu->add() operation to the
      pmu->commit() operation.
      
      Now that we have "transaction types" (PERF_PMU_TXN_ADD, PERF_PMU_TXN_READ)
      we can use the type to determine if we are in a transaction and drop the
      PERF_EVENT_TXN flag.
      
      When PERF_EVENT_TXN is dropped, the cpuhw->group_flag on some architectures
      becomes unused, so drop that field as well.
      
      This is an extension of the PowerPC patch from Peter Zijlstra to the
      s390, Sparc and x86 architectures.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1441336073-22750-11-git-send-email-sukadev@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8f3e5684
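      A minimal sketch of the replacement check this entry describes (txn_flags and PERF_PMU_TXN_ADD come from the companion entry just below; the pmu->add() call site shown is illustrative):

      	/*
      	 * Previously: if (cpuhw->group_flag & PERF_EVENT_TXN) goto nocheck;
      	 * The transaction type itself now tells us whether an ADD
      	 * transaction is in flight, so PERF_EVENT_TXN and group_flag go.
      	 */
      	if (cpuhw->txn_flags & PERF_PMU_TXN_ADD)
      		goto nocheck;	/* schedulability is checked at commit time */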
    • perf/core: Add a 'flags' parameter to the PMU transactional interfaces · fbbe0701
      Sukadev Bhattiprolu authored
      Currently, the PMU interface allows reading only one counter at a time.
      But some PMUs, like the 24x7 counters in Power, support reading several
      counters at once. To leverage this functionality, extend the
      transaction interface to support a "transaction type".
      
      The first type, PERF_PMU_TXN_ADD, refers to the existing transactions,
      i.e. used to _schedule_ all the events on the PMU as a group. A second
      transaction type, PERF_PMU_TXN_READ, will be used in a follow-on patch,
      by the 24x7 counters to read several counters at once.
      
      Extend the transaction interfaces to the PMU to accept a 'txn_flags'
      parameter and use this parameter to ignore any transactions that are
      not of type PERF_PMU_TXN_ADD.
      
      Thanks to Peter Zijlstra for his input.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      [peterz: s390 compile fix]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1441336073-22750-3-git-send-email-sukadev@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fbbe0701
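      A minimal sketch of a start_txn callback extended as described above (the powerpc callback name and the cpuhw bookkeeping are assumptions; txn_flags and PERF_PMU_TXN_ADD are from the entry):

      	static void power_pmu_start_txn(struct pmu *pmu, unsigned int txn_flags)
      	{
      		struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);

      		cpuhw->txn_flags = txn_flags;	/* remember the transaction type */
      		if (txn_flags & ~PERF_PMU_TXN_ADD)
      			return;			/* only ADD transactions need setup */

      		perf_pmu_disable(pmu);
      		cpuhw->n_txn_start = cpuhw->n_events;
      	}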
  22. 27 Jul 2015, 1 commit
  23. 02 Jun 2015, 1 commit
    • powerpc/perf: Fix book3s kernel to userspace backtraces · 72e349f1
      Anton Blanchard authored
      When we take a PMU exception or a software event we call
      perf_read_regs(). This overloads regs->result with a boolean that
      describes whether we should use the sampled instruction address
      register (SIAR) or the regs.
      
      If the exception is in the kernel, we start with the kernel regs and
      backtrace through the kernel stack. At that point we switch to the
      userspace regs and backtrace the user stack with perf_callchain_user().
      
      Unfortunately those regs have not had the perf_read_regs() treatment,
      so regs->result could be anything. If it is non-zero,
      perf_instruction_pointer() decides to use the SIAR, and we get issues
      like this:
      
      0.11%  qemu-system-ppc  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
             |
             ---_raw_spin_lock_irqsave
                |
                |--52.35%-- 0
                |          |
                |          |--46.39%-- __hrtimer_start_range_ns
                |          |          kvmppc_run_core
                |          |          kvmppc_vcpu_run_hv
                |          |          kvmppc_vcpu_run
                |          |          kvm_arch_vcpu_ioctl_run
                |          |          kvm_vcpu_ioctl
                |          |          do_vfs_ioctl
                |          |          sys_ioctl
                |          |          system_call
                |          |          |
                |          |          |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
                |          |          |          |
                |          |          |           --100.00%-- 0x7e714
                |          |          |                     0x7e714
      
      Notice the bogus _raw_spin_lock_irqsave when we transition from kernel
      (system_call) to userspace (0x7e714). We inserted what was in the SIAR.
      
      Add a check in regs_use_siar() to verify that the regs in question
      are from a PMU exception. With this fix the backtrace makes sense:
      
           0.47%  qemu-system-ppc  [kernel.vmlinux]         [k] _raw_spin_lock_irqsave
                  |
                  ---_raw_spin_lock_irqsave
                     |
                     |--53.83%-- 0
                     |          |
                     |          |--44.73%-- hrtimer_try_to_cancel
                     |          |          kvmppc_start_thread
                     |          |          kvmppc_run_core
                     |          |          kvmppc_vcpu_run_hv
                     |          |          kvmppc_vcpu_run
                     |          |          kvm_arch_vcpu_ioctl_run
                     |          |          kvm_vcpu_ioctl
                     |          |          do_vfs_ioctl
                     |          |          sys_ioctl
                     |          |          system_call
                     |          |          __ioctl
                     |          |          0x7e714
                     |          |          0x7e714
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      72e349f1
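      A minimal sketch of the regs_use_siar() check described above (the trap value 0xf00 and the result-bit test are assumptions about how "regs are from a PMU exception" is expressed, not quoted from this log):

      	static inline int regs_use_siar(struct pt_regs *regs)
      	{
      		/*
      		 * Only trust regs->result if these regs were set up by
      		 * perf_read_regs(), i.e. if they come from a performance
      		 * monitor exception; otherwise result is whatever the
      		 * original exception path left in there.
      		 */
      		return ((TRAP(regs) == 0xf00) && (regs->result & 1));
      	}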
  24. 27 Mar 2015, 1 commit
    • powerpc/perf: add missing put_cpu_var in power_pmu_event_init · 68de8867
      Jan Stancek authored
      One path in power_pmu_event_init() calls get_cpu_var() but is missing
      the matching call to put_cpu_var(), which causes a preemption imbalance
      and a crash in user-space:
      
        Page fault in user mode with in_atomic() = 1 mm = c000001fefa5a280
        NIP = 3fff9bf2cae0  MSR = 900000014280f032
        Oops: Weird page fault, sig: 11 [#23]
        SMP NR_CPUS=2048 NUMA PowerNV
        Modules linked in: <snip>
        CPU: 43 PID: 10285 Comm: a.out Tainted: G      D         4.0.0-rc5+ #1
        task: c000001fe82c9200 ti: c000001fe835c000 task.ti: c000001fe835c000
        NIP: 00003fff9bf2cae0 LR: 00003fff9bee4898 CTR: 00003fff9bf2cae0
        REGS: c000001fe835fea0 TRAP: 0401   Tainted: G      D          (4.0.0-rc5+)
        MSR: 900000014280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI>  CR: 22000028  XER: 00000000
        CFAR: 00003fff9bee4894 SOFTE: 1
         GPR00: 00003fff9bee494c 00003fffe01c2ee0 00003fff9c084410 0000000010020068
         GPR04: 0000000000000000 0000000000000002 0000000000000008 0000000000000001
         GPR08: 0000000000000001 00003fff9c074a30 00003fff9bf2cae0 00003fff9bf2cd70
         GPR12: 0000000052000022 00003fff9c10b700
        NIP [00003fff9bf2cae0] 0x3fff9bf2cae0
        LR [00003fff9bee4898] 0x3fff9bee4898
        Call Trace:
        ---[ end trace 5d3d952b5d4185d4 ]---
      
        BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
        in_atomic(): 1, irqs_disabled(): 0, pid: 10285, name: a.out
        INFO: lockdep is turned off.
        CPU: 43 PID: 10285 Comm: a.out Tainted: G      D         4.0.0-rc5+ #1
        Call Trace:
        [c000001fe835f990] [c00000000089c014] .dump_stack+0x98/0xd4 (unreliable)
        [c000001fe835fa10] [c0000000000e4138] .___might_sleep+0x1d8/0x2e0
        [c000001fe835faa0] [c000000000888da8] .down_read+0x38/0x110
        [c000001fe835fb30] [c0000000000bf2f4] .exit_signals+0x24/0x160
        [c000001fe835fbc0] [c0000000000abde0] .do_exit+0xd0/0xe70
        [c000001fe835fcb0] [c00000000001f4c4] .die+0x304/0x450
        [c000001fe835fd60] [c00000000088e1f4] .do_page_fault+0x2d4/0x900
        [c000001fe835fe30] [c000000000008664] handle_page_fault+0x10/0x30
        note: a.out[10285] exited with preempt_count 1
      
      Reproducer:
        #include <stdio.h>
        #include <unistd.h>
        #include <syscall.h>
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <linux/perf_event.h>
        #include <linux/hw_breakpoint.h>
      
        static struct perf_event_attr event = {
                .type = PERF_TYPE_RAW,
                .size = sizeof(struct perf_event_attr),
                .sample_type = PERF_SAMPLE_BRANCH_STACK,
                .branch_sample_type = PERF_SAMPLE_BRANCH_ANY_RETURN,
        };
      
        int main()
        {
                syscall(__NR_perf_event_open, &event, 0, -1, -1, 0);
        }
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      68de8867
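      A minimal sketch of the balanced pattern the fix implies in power_pmu_event_init() (variable names and the exact error path are assumptions, not quoted from this log):

      	cpuhw = &get_cpu_var(cpu_hw_events);
      	err = power_check_constraints(cpuhw, events, cflags, n + 1);
      	/* ... any other per-cpu checks ... */
      	put_cpu_var(cpu_hw_events);	/* every get_cpu_var() path must re-enable
      					   preemption before returning */
      	if (err)
      		return -EINVAL;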
  25. 19 Feb 2015, 1 commit
    • perf, powerpc: Fix up flush_branch_stack() users · acba3c7e
      Peter Zijlstra authored
      The recent LBR rework for x86 left a stray flush_branch_stack() user in
      the PowerPC code, fix that up.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joel Stanley <joel@jms.id.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      acba3c7e
  26. 03 Nov 2014, 1 commit
    • powerpc: Replace __get_cpu_var uses · 69111bac
      Christoph Lameter authored
      This still has not been merged and now powerpc is the only arch that does
      not have this change. Sorry about missing linuxppc-dev before.
      
      V2->V2
        - Fix up to work against 3.18-rc1
      
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      [mpe: Fix build errors caused by set/or_softirq_pending(), and rework
            assignment in __set_breakpoint() to use memcpy().]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      69111bac
  27. 25 Sep 2014, 1 commit
  28. 27 Aug 2014, 2 commits
    • Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo authored
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23f66e2d
    • powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter authored
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      5828f666
  29. 28 Jul 2014, 3 commits
    • powerpc/perf: Add per-event excludes on Power8 · 9de5cb0f
      Michael Ellerman authored
      Power8 has a new register (MMCR2), which contains individual freeze bits
      for each counter. This is an improvement on previous chips as it means
      we can have multiple events on the PMU at the same time with different
      exclude_{user,kernel,hv} settings. Previously we had to ensure all
      events on the PMU had the same exclude settings.
      
      The core of the patch is fairly simple. We use the 207S feature flag to
      indicate that the PMU backend supports per-event excludes, if it's set
      we skip the generic logic that enforces the equality of excludes between
      events. We also use that flag to skip setting the freeze bits in MMCR0,
      the PMU backend is expected to have handled setting them in MMCR2.
      
      The complication arises with EBB. The FCxP bits in MMCR2 are accessible
      R/W to a task using EBB. Which means a task using EBB will be able to
      see that we are using MMCR2 for freezing, whereas the old logic which
      used MMCR0 is not user visible.
      
      The task can not see or affect exclude_kernel & exclude_hv, so we only
      need to consider exclude_user.
      
      The table below summarises the behaviour both before and after this
      commit is applied:
      
       exclude_user           true  false
       ------------------------------------
              | User visible |  N    N
       Before | Can freeze   |  Y    Y
              | Can unfreeze |  N    Y
       ------------------------------------
              | User visible |  Y    Y
        After | Can freeze   |  Y    Y
              | Can unfreeze |  Y/N  Y
       ------------------------------------
      
      So firstly I assert that the simple visibility of the exclude_user
      setting in MMCR2 is a non-issue. The event belongs to the task, and
      was most likely created by the task. So the exclude_user setting is not
      privileged information in any way.
      
      Secondly, the behaviour in the exclude_user = false case is unchanged.
      This is important as it is the case that is actually useful, ie. the
      event is created with no exclude setting and the task uses MMCR2 to
      implement exclusion manually.
      
      For exclude_user = true there is no meaningful change to freezing the
      event. Previously the task could use MMCR2 to freeze the event, though
      it was already frozen with MMCR0. With the new code the task can use
      MMCR2 to freeze the event, though it was already frozen with MMCR2.
      
      The only real change is when exclude_user = true and the task tries to
      use MMCR2 to unfreeze the event. Previously this had no effect, because
      the event was already frozen in MMCR0. With the new code the task can
      unfreeze the event in MMCR2, but at some indeterminate time in the
      future the kernel will overwrite its setting and refreeze the event.
      
      Therefore my final assertion is that any task using exclude_user = true
      and also fiddling with MMCR2 was deeply confused before this change, and
      remains so after it.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      9de5cb0f
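      A minimal sketch of the core change described above (the PPMU_ARCH_207S flag name and the check_excludes() helper are assumptions based on the powerpc perf code, not quoted from this log):

      	/*
      	 * If the backend supports per-event excludes (MMCR2 freeze bits),
      	 * don't force every event in the group to share one exclude
      	 * setting, and leave the MMCR0 freeze bits to the backend.
      	 */
      	if (!(ppmu->flags & PPMU_ARCH_207S) &&
      	    check_excludes(cpuhw->event, cpuhw->flags, 0, n_ev))
      		goto out;	/* incompatible exclude settings, give up */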
    • powerpc/perf: Pass the struct perf_events down to compute_mmcr() · 8abd818f
      Michael Ellerman authored
      To support per-event exclude settings on Power8 we need access to the
      struct perf_events in compute_mmcr().
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8abd818f
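      A minimal sketch of the widened callback this implies (the exact parameter list is an assumption modeled on the powerpc struct power_pmu ops, not quoted from this log):

      	/* The backend now also receives the events themselves, so it can
      	 * honour each event's exclude_user/kernel/hv settings. */
      	int (*compute_mmcr)(u64 events[], int n_ev,
      			    unsigned int hwc[], unsigned long mmcr[],
      			    struct perf_event *pevents[]);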
    • powerpc/perf: Clear all MMCR settings before calling compute_mmcr() · 79a4cb28
      Michael Ellerman authored
      Because we reuse cpuhw->mmcr on each call to compute_mmcr() there's a
      risk that we could forget to set one of the values and use whatever
      value was in there previously.
      
      Currently all the implementations are careful to set all the values, but
      it's safer to clear them all before we call compute_mmcr().
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      79a4cb28
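      A minimal sketch of the defensive clear described above (the cpuhw field name and the size of the mmcr array are assumptions):

      	/* Start from a clean slate so stale MMCR values from a previous
      	 * event mix can never leak into the new programming. */
      	memset(&cpuhw->mmcr, 0, sizeof(cpuhw->mmcr));

      	/* ... then let the backend fill in only what it needs via
      	 * ppmu->compute_mmcr(), as before ... */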