1. 27 Aug 2020: 1 commit
  2. 20 Aug 2020: 1 commit
  3. 17 Aug 2020: 1 commit
    • powerpc/perf: Add support for outputting extended regs in perf intr_regs · 781fa481
      Anju T Sudhakar authored
      Add support for the perf extended register capability on powerpc. The
      capability flag PERF_PMU_CAP_EXTENDED_REGS is used to indicate a PMU
      that supports extended registers. The generic code defines the mask
      of extended registers as 0 for unsupported architectures.
      
      This patch adds extended regs support for the power9 platform by
      exposing the MMCR0, MMCR1 and MMCR2 registers.
      
      The REG_RESERVED mask needs updating to include the extended regs.
      PERF_REG_EXTENDED_MASK, which contains the mask value of the supported
      registers, is defined at runtime in the kernel based on the platform,
      since the supported registers (and hence the mask value) may differ
      from one processor version to another.
      
      With the patch:
      
        available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
        r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
        r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
        trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2
      
        PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
        ... intr regs: mask 0xffffffffffff ABI 64-bit
        .... r0    0xc00000000012b77c
        .... r1    0xc000003fe5e03930
        .... r2    0xc000000001b0e000
        .... r3    0xc000003fdcddf800
        .... r4    0xc000003fc7880000
        .... r5    0x9c422724be
        .... r6    0xc000003fe5e03908
        .... r7    0xffffff63bddc8706
        .... r8    0x9e4
        .... r9    0x0
        .... r10   0x1
        .... r11   0x0
        .... r12   0xc0000000001299c0
        .... r13   0xc000003ffffc4800
        .... r14   0x0
        .... r15   0x7fffdd8b8b00
        .... r16   0x0
        .... r17   0x7fffdd8be6b8
        .... r18   0x7e7076607730
        .... r19   0x2f
        .... r20   0xc00000001fc26c68
        .... r21   0xc0002041e4227e00
        .... r22   0xc00000002018fb60
        .... r23   0x1
        .... r24   0xc000003ffec4d900
        .... r25   0x80000000
        .... r26   0x0
        .... r27   0x1
        .... r28   0x1
        .... r29   0xc000000001be1260
        .... r30   0x6008010
        .... r31   0xc000003ffebb7218
        .... nip   0xc00000000012b910
        .... msr   0x9000000000009033
        .... orig_r3 0xc00000000012b86c
        .... ctr   0xc0000000001299c0
        .... link  0xc00000000012b77c
        .... xer   0x0
        .... ccr   0x28002222
        .... softe 0x1
        .... trap  0xf00
        .... dar   0x0
        .... dsisr 0x80000000000
        .... sier  0x0
        .... mmcra 0x80000000000
        .... mmcr0 0x82008090
        .... mmcr1 0x1e000000
        .... mmcr2 0x0
         ... thread: perf:4784
      Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1596794701-23530-2-git-send-email-atrajeev@linux.vnet.ibm.com
  4. 27 Jul 2020: 1 commit
    • powerpc/64s/hash: Fix hash_preload running with interrupts enabled · 909adfc6
      Nicholas Piggin authored
      Commit 2f92447f ("powerpc/book3s64/hash: Use the pte_t address from the
      caller") removed the local_irq_disable from hash_preload, but it was
      required for more than just the page table walk: the hash pte busy bit is
      effectively a lock which may be taken in interrupt context, and the local
      update flag test must not be preempted before it's used.
      
      This solves apparent lockups with perf interrupting __hash_page_64K. If
      get_perf_callchain then also takes a hash fault on the same page while it
      is already locked, it will loop forever taking hash faults, which looks like
      this:
      
        cpu 0x49e: Vector: 100 (System Reset) at [c00000001a4f7d70]
            pc: c000000000072dc8: hash_page_mm+0x8/0x800
            lr: c00000000000c5a4: do_hash_page+0x24/0x38
            sp: c0002ac1cc69ac70
           msr: 8000000000081033
          current = 0xc0002ac1cc602e00
          paca    = 0xc00000001de1f280   irqmask: 0x03   irq_happened: 0x01
            pid   = 20118, comm = pread2_processe
        Linux version 5.8.0-rc6-00345-g1fad14f18bc6
        49e:mon> t
        [c0002ac1cc69ac70] c00000000000c5a4 do_hash_page+0x24/0x38 (unreliable)
        --- Exception: 300 (Data Access) at c00000000008fa60 __copy_tofrom_user_power7+0x20c/0x7ac
        [link register   ] c000000000335d10 copy_from_user_nofault+0xf0/0x150
        [c0002ac1cc69af70] c00032bf9fa3c880 (unreliable)
        [c0002ac1cc69afa0] c000000000109df0 read_user_stack_64+0x70/0xf0
        [c0002ac1cc69afd0] c000000000109fcc perf_callchain_user_64+0x15c/0x410
        [c0002ac1cc69b060] c000000000109c00 perf_callchain_user+0x20/0x40
        [c0002ac1cc69b080] c00000000031c6cc get_perf_callchain+0x25c/0x360
        [c0002ac1cc69b120] c000000000316b50 perf_callchain+0x70/0xa0
        [c0002ac1cc69b140] c000000000316ddc perf_prepare_sample+0x25c/0x790
        [c0002ac1cc69b1a0] c000000000317350 perf_event_output_forward+0x40/0xb0
        [c0002ac1cc69b220] c000000000306138 __perf_event_overflow+0x88/0x1a0
        [c0002ac1cc69b270] c00000000010cf70 record_and_restart+0x230/0x750
        [c0002ac1cc69b620] c00000000010d69c perf_event_interrupt+0x20c/0x510
        [c0002ac1cc69b730] c000000000027d9c performance_monitor_exception+0x4c/0x60
        [c0002ac1cc69b750] c00000000000b2f8 performance_monitor_common_virt+0x1b8/0x1c0
        --- Exception: f00 (Performance Monitor) at c0000000000cb5b0 pSeries_lpar_hpte_insert+0x0/0x160
        [link register   ] c0000000000846f0 __hash_page_64K+0x210/0x540
        [c0002ac1cc69ba50] 0000000000000000 (unreliable)
        [c0002ac1cc69bb00] c000000000073ae0 update_mmu_cache+0x390/0x3a0
        [c0002ac1cc69bb70] c00000000037f024 wp_page_copy+0x364/0xce0
        [c0002ac1cc69bc20] c00000000038272c do_wp_page+0xdc/0xa60
        [c0002ac1cc69bc70] c0000000003857bc handle_mm_fault+0xb9c/0x1b60
        [c0002ac1cc69bd50] c00000000006c434 __do_page_fault+0x314/0xc90
        [c0002ac1cc69be20] c00000000000c5c8 handle_page_fault+0x10/0x2c
        --- Exception: 300 (Data Access) at 00007fff8c861fe8
        SP (7ffff6b19660) is in userspace
      
      Fixes: 2f92447f ("powerpc/book3s64/hash: Use the pte_t address from the caller")
      Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Reported-by: Anton Blanchard <anton@ozlabs.org>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200727060947.10060-1-npiggin@gmail.com
  5. 22 Jul 2020: 6 commits
  6. 21 Jul 2020: 1 commit
  7. 18 Jun 2020: 2 commits
  8. 18 May 2020: 1 commit
    • powerpc: Use a datatype for instructions · 94afd069
      Jordan Niethe authored
      Currently unsigned ints are used to represent instructions on powerpc.
      This has worked well as instructions have always been 4 byte words.
      
      However, ISA v3.1 introduces prefixed instructions, a change which
      means this scheme will no longer work as well. A prefixed instruction
      is made up of a word prefix followed by a word suffix, forming an
      8 byte doubleword instruction. Regardless of the endianness of the
      system, the prefix always comes first. Prefixed instructions are only
      planned for powerpc64.
      
      Introduce a ppc_inst type to represent both prefixed and word
      instructions on powerpc64 while keeping it possible to exclusively
      have word instructions on powerpc32.
      Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
      [mpe: Fix compile error in emulate_spe()]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200506034050.24806-12-jniethe5@gmail.com
  9. 11 Feb 2020: 1 commit
    • perf/core: Add new branch sample type for HW index of raw branch records · bbfd5e4f
      Kan Liang authored
      The low level index is the index in the underlying hardware buffer of
      the most recently captured taken branch, which is always saved in
      branch_entries[0]. It is very useful for reconstructing the call stack.
      For example, in Intel LBR call stack mode, the depth of the
      reconstructed LBR call stack is limited to the number of LBR registers.
      With the low level index information, the perf tool may stitch the
      stacks of two samples, so the reconstructed LBR call stack can exceed
      the HW limitation.
      
      Add a new branch sample type to retrieve the low level index of raw
      branch records. The low level index is between -1 (unknown) and the
      maximum depth, which can be retrieved from /sys/devices/cpu/caps/branches.
      
      The low level index information is only dumped into the
      PERF_SAMPLE_BRANCH_STACK output when the new branch sample type is set.
      The perf tool should check attr.branch_sample_type and apply the
      corresponding format for PERF_SAMPLE_BRANCH_STACK samples; otherwise,
      some use cases may break. For example, users may parse a perf.data
      file which includes the new branch sample type with an old version of
      the perf tool (without the check) and probably get incorrect
      information without any warning.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lkml.kernel.org/r/20200127165355.27495-2-kan.liang@linux.intel.com
  10. 25 Jan 2020: 1 commit
  11. 18 Oct 2019: 1 commit
    • perf_event: Add support for LSM and SELinux checks · da97e184
      Joel Fernandes (Google) authored
      In current mainline, the degree of access to the perf_event_open(2)
      system call depends on the perf_event_paranoid sysctl. This has a
      number of limitations:
      
      1. The sysctl is only a single value. Many types of accesses are controlled
         based on the single value thus making the control very limited and
         coarse grained.
      2. The sysctl is global, so if the sysctl is changed, then that means
         all processes get access to perf_event_open(2) opening the door to
         security issues.
      
      This patch adds LSM and SELinux access checking which will be used in
      Android to access perf_event_open(2) for the purposes of attaching BPF
      programs to tracepoints, perf profiling and other operations from
      userspace. These operations are intended for production systems.
      
      5 new LSM hooks are added:
      1. perf_event_open: This controls access during the perf_event_open(2)
         syscall itself. The hook is called from all the places that the
         perf_event_paranoid sysctl is checked, to keep it consistent with the
         sysctl. The hook gets passed a 'type' argument which controls CPU,
         kernel and tracepoint accesses (in this context, CPU, kernel and
         tracepoint have the same semantics as the perf_event_paranoid sysctl).
         Additionally, I added an 'open' type which is similar to the
         perf_event_paranoid sysctl == 3 patch carried in Android and several
         other distros, but which was rejected in mainline [1] in 2016.
      
      2. perf_event_alloc: This allocates a new security object for the event
         which stores the current SID within the event. It will be useful when
         the perf event's FD is passed through IPC to another process which may
         try to read the FD. Appropriate security checks will limit access.
      
      3. perf_event_free: Called when the event is closed.
      
      4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.
      
      5. perf_event_write: Called from the ioctl(2) syscalls for the event.
      
      [1] https://lwn.net/Articles/696240/
      
      Since Peter had suggested LSM hooks in 2016 [1], I am adding his
      Suggested-by tag below.
      
      To use this patch, we set the perf_event_paranoid sysctl to -1 and then
      apply selinux checking as appropriate (default deny everything, and then
      add policy rules to give access to domains that need it). In the future
      we can remove the perf_event_paranoid sysctl altogether.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Co-developed-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: James Morris <jmorris@namei.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: rostedt@goodmis.org
      Cc: Yonghong Song <yhs@fb.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: jeffv@google.com
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: primiano@google.com
      Cc: Song Liu <songliubraving@fb.com>
      Cc: rsavitski@google.com
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Matthew Garrett <matthewgarrett@google.com>
      Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
  12. 31 May 2019: 1 commit
  13. 22 May 2019: 1 commit
    • powerpc/perf: Fix MMCRA corruption by bhrb_filter · 3202e35e
      Ravi Bangoria authored
      Consider a scenario where a user creates two events:
      
        1st event:
          attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
          attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
          fd = perf_event_open(attr, 0, 1, -1, 0);
      
        This sets cpuhw->bhrb_filter to 0 and returns a valid fd.
      
        2nd event:
          attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
          attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
          fd = perf_event_open(attr, 0, 1, -1, 0);
      
        This overwrites cpuhw->bhrb_filter with -1 and returns an error.
      
      Now if power_pmu_enable() gets called by any path other than
      power_pmu_add(), ppmu->config_bhrb(-1) will set MMCRA to -1.
      
      Fixes: 3925f46b ("powerpc/perf: Enable branch stack sampling framework")
      Cc: stable@vger.kernel.org # v3.10+
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  14. 03 May 2019: 2 commits
    • powerpc/perf: Add generic compat mode pmu driver · be80e758
      Madhavan Srinivasan authored
      Most of the power processor generation performance monitoring
      unit (PMU) driver code is bundled in the kernel, and one of those
      drivers is enabled/registered based on the oprofile_cpu_type check
      at boot.
      
      But things get a little tricky in the case of a "compat" mode boot.
      IBM POWER System Server based processors have a compatibility
      mode feature which, simply put, means an Nth generation processor
      (say POWER8) will act and appear in a mode consistent
      with an earlier generation (N-1) processor (that is, POWER7).
      In this "compat" mode boot, the kernel modifies the
      "oprofile_cpu_type" to be the Nth generation (POWER8). If the Nth
      generation pmu driver is bundled (POWER8), it gets registered.
      
      The key dependency here is having distro support for the latest
      processor's performance monitoring support. This patch adds
      a generic "compat-mode" performance monitoring driver to be
      registered in the absence of a powernv platform specific pmu driver.
      
      The driver supports only "cycles" and "instructions" events:
      "0x0001e" is used as the event code for "cycles" and "0x00002"
      as the event code for "instructions". A new file called
      "generic-compat-pmu.c" is created to contain the driver
      specific code, and the base raw event code format is modeled
      on PPMU_ARCH_207S.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      [mpe: Use SPDX tag for license]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: init pmu from core-book3s · 708597da
      Madhavan Srinivasan authored
      Currently, the pmu driver file for each ppc64 generation processor
      has an __init call of its own. Refactor the code by moving the
      __init call to core-book3s.c. This also cleans up compat mode
      pmu driver registration.
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      [mpe: Use SPDX tag for license]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  15. 21 Dec 2018: 1 commit
  16. 20 Dec 2018: 2 commits
    • powerpc/perf: Add constraints for power9 l2/l3 bus events · 59029136
      Madhavan Srinivasan authored
      In previous generation processors, both bus events and direct
      events of the performance monitoring unit could be individually
      programmed and monitored in the PMCs.
      
      But in Power9, L2/L3 bus events are always available as a
      "bank" of 4 events. To obtain the counts for any of the
      l2/l3 bus events in a given bank, the user has to
      program PMC4 with the corresponding l2/l3 bus event for that
      bank.
      
      This patch enforces two constraints in the case of L2/L3 bus events:
      
      1) Any L2/L3 event, when programmed, is also expected to program the
      corresponding PMC4 event from that group.
      2) The PMC4 event should always be programmed first, due to a group
      constraint logic limitation.
      
      For ex. consider these L3 bus events
      
      PM_L3_PF_ON_CHIP_MEM (0x460A0),
      PM_L3_PF_MISS_L3 (0x160A0),
      PM_L3_CO_MEM (0x260A0),
      PM_L3_PF_ON_CHIP_CACHE (0x360A0),
      
      1) This is an INVALID group for L3 Bus event monitoring,
      since it is missing PMC4 event.
      	perf stat -e "{r160A0,r260A0,r360A0}" < >
      
      And this is a VALID group for L3 Bus events:
      	perf stat -e "{r460A0,r160A0,r260A0,r360A0}" < >
      
      2) This is an INVALID group for L3 Bus event monitoring,
      since it is missing PMC4 event.
      	perf stat -e "{r260A0,r360A0}" < >
      
      And this is a VALID group for L3 Bus events:
      	perf stat -e "{r460A0,r260A0,r360A0}" < >
      
      3) This is an INVALID group for L3 Bus event monitoring,
      since it is missing PMC4 event.
      	perf stat -e "{r360A0}" < >
      
      And this is a VALID group for L3 Bus events:
      	perf stat -e "{r460A0,r360A0}" < >
      
      The patch implements the group constraint logic suggested by
      Michael Ellerman.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: Update perf_regs structure to include SIER · 333804dc
      Madhavan Srinivasan authored
      On each sample, the Sample Instruction Event Register (SIER) content
      is saved in pt_regs. SIER does not have an entry as-is in pt_regs;
      instead, its content is saved in the "dar" register of pt_regs.
      
      This patch adds another entry to the perf_regs structure to include
      "SIER" printing, which internally maps to the "dar" of pt_regs.
      
      It also checks for SIER availability on the platform and presents
      the value accordingly.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  17. 16 Jul 2018: 2 commits
  18. 27 Mar 2018: 4 commits
  19. 17 Mar 2018: 1 commit
  20. 16 Mar 2018: 1 commit
  21. 12 Mar 2018: 1 commit
    • perf/core: Remove perf_event::group_entry · 8343aae6
      Peter Zijlstra authored
      Now that all the grouping is done with RB trees, we no longer need
      group_entry and can replace the whole thing with sibling_list.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Valery Cherepennikov <valery.cherepennikov@intel.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  22. 19 Jan 2018: 2 commits
    • powerpc/64: Change soft_enabled from flag to bitmask · 01417c6c
      Madhavan Srinivasan authored
      "paca->soft_enabled" is used as a flag to mask some of interrupts.
      Currently supported flags values and their details:
      
      soft_enabled    MSR[EE]
      
      0               0       Disabled (PMI and HMI not masked)
      1               1       Enabled
      
      "paca->soft_enabled" is initialized to 1 to make the interripts as
      enabled. arch_local_irq_disable() will toggle the value when
      interrupts needs to disbled. At this point, the interrupts are not
      actually disabled, instead, interrupt vector has code to check for the
      flag and mask it when it occurs. By "mask it", it update interrupt
      paca->irq_happened and return. arch_local_irq_restore() is called to
      re-enable interrupts, which checks and replays interrupts if any
      occured.
      
      Now, as mentioned, current logic doesnot mask "performance monitoring
      interrupts" and PMIs are implemented as NMI. But this patchset depends
      on local_irq_* for a successful local_* update. Meaning, mask all
      possible interrupts during local_* update and replay them after the
      update.
      
      So the idea here is to reserve the "paca->soft_enabled" logic. New
      values and details:
      
      soft_enabled    MSR[EE]
      
      1               0       Disabled  (PMI and HMI not masked)
      0               1       Enabled
      
      Reason for the this change is to create foundation for a third mask
      value "0x2" for "soft_enabled" to add support to mask PMIs. When
      ->soft_enabled is set to a value "3", PMI interrupts are mask and when
      set to a value of "1", PMI are not mask. With this patch also extends
      soft_enabled as interrupt disable mask.
      
      Current flags are renamed from IRQ_[EN?DIS}ABLED to
      IRQS_ENABLED and IRQS_DISABLED.
      
      Patch also fixes the ptrace call to force the user to see the softe
      value to be alway 1. Reason being, even though userspace has no
      business knowing about softe, it is part of pt_regs. Like-wise in
      signal context.
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Add #defines for paca->soft_enabled flags · c2e480ba
      Madhavan Srinivasan authored
      Two #defines, IRQS_ENABLED and IRQS_DISABLED, are added to be used
      when updating paca->soft_enabled. Replace the hardcoded values used
      when updating paca->soft_enabled with the IRQS_(EN|DIS)ABLED
      #defines. No logic change.
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  23. 13 Dec 2017: 1 commit
    • powerpc/perf: Dereference BHRB entries safely · f41d84dd
      Ravi Bangoria authored
      It's theoretically possible that branch instructions recorded in
      BHRB (Branch History Rolling Buffer) entries have already been
      unmapped before they are processed by the kernel. Hence, trying to
      dereference such a memory location will result in a crash. e.g.:
      
          Unable to handle kernel paging request for data at address 0xd000000019c41764
          Faulting instruction address: 0xc000000000084a14
          NIP [c000000000084a14] branch_target+0x4/0x70
          LR [c0000000000eb828] record_and_restart+0x568/0x5c0
          Call Trace:
          [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable)
          [c0000000000ec378] perf_event_interrupt+0x298/0x460
          [c000000000027964] performance_monitor_exception+0x54/0x70
          [c000000000009ba4] performance_monitor_common+0x114/0x120
      
      Fix it by dereferencing the addresses safely.
      
      Fixes: 69123184 ("powerpc/perf: Fix setting of "to" addresses for BHRB")
      Cc: stable@vger.kernel.org # v3.10+
      Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      [mpe: Use probe_kernel_read() which is clearer, tweak change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  24. 04 Dec 2017: 1 commit
    • powerpc/perf: Fix oops when grouping different pmu events · 5aa04b3e
      Ravi Bangoria authored
      When a user tries to group an imc (In-Memory Collections) event with
      a normal event, the kernel (sometimes) crashes with the following log:
      
          Faulting instruction address: 0x00000000
          [link register   ] c00000000010ce88 power_check_constraints+0x128/0x980
          ...
          c00000000010e238 power_pmu_event_init+0x268/0x6f0
          c0000000002dc60c perf_try_init_event+0xdc/0x1a0
          c0000000002dce88 perf_event_alloc+0x7b8/0xac0
          c0000000002e92e0 SyS_perf_event_open+0x530/0xda0
          c00000000000b004 system_call+0x38/0xe0
      
      'event_base' field of 'struct hw_perf_event' is used as flags for
      normal hw events and used as memory address for imc events. While
      grouping these two types of events, collect_events() tries to
      interpret imc 'event_base' as a flag, which causes a corruption
      resulting in a crash.
      
      Consider only those events which belong to 'perf_hw_context' in
      collect_events().
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  25. 20 Sep 2017: 1 commit
  26. 29 Aug 2017: 1 commit
    • perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR · fc7ce9c7
      Kan Liang authored
      For understanding how the workload maps to memory channels and hardware
      behavior, it's very important to collect address maps with physical
      addresses. For example, 3D XPoint access can only be found by filtering
      the physical address.
      
      Add a new sample type for physical address.
      
      perf already has a facility to collect the data virtual address. This
      patch introduces a function to convert the virtual address to a
      physical address. The function is quite generic and can be extended
      to any architecture as long as a virtual address is provided.
      
       - For kernel direct mapping addresses, virt_to_phys is used to convert
         the virtual addresses to physical address.
      
       - For user virtual addresses, __get_user_pages_fast is used to walk
         the page tables to find the user physical address.
      
       - This does not work for vmalloc addresses right now. These are not
         resolved, but code to do that could be added.
      
      The new sample type requires collecting the virtual address. The
      virtual address will not be output unless SAMPLE_ADDR is applied.
      
      For security, the physical address can only be exposed to root or
      privileged user.
      Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: mpe@ellerman.id.au
      Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. 19 Apr 2017: 1 commit