1. 27 Aug, 2020 4 commits
    • A
      powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc · 82715a0f
      Athira Rajeev committed
      IMC trace-mode uses MSR[HV/PR] bits to set the cpumode for the
      instruction pointer captured in each sample. The bits are fetched from
      the third double word of the trace record. Reading the third double
      word from the IMC trace record should use be64_to_cpu() along with
      READ_ONCE() in order to fetch the correct MSR[HV/PR] bits. This patch
      makes that change.
      
      Currently we are using PERF_RECORD_MISC_HYPERVISOR as cpumode if MSR
      HV is 1 and PR is 0 which means the address is from host counter. But
      using PERF_RECORD_MISC_HYPERVISOR for host counter data will fail to
      resolve the address -> symbol during "perf report" because perf tools
      side uses PERF_RECORD_MISC_KERNEL to represent the host counter data.
      Therefore, fix the trace imc sample data to use
      PERF_RECORD_MISC_KERNEL as cpumode for host kernel information.
      
      Fixes: 77ca3951 ("powerpc/perf: Add kernel support for new MSR[HV PR] bits in trace-imc")
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1598424029-1662-1-git-send-email-atrajeev@linux.vnet.ibm.com
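The two fixes described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: `load_be64()` is a portable stand-in for be64_to_cpu(), the `volatile` read stands in for READ_ONCE(), and the position of the HV/PR field (`HVPR_SHIFT`) and the function name `cpumode_for_record()` are assumptions made only for this example. The two cpumode values mirror the constants in the perf uapi header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* cpumode values as defined by the perf uapi header. */
#define PERF_RECORD_MISC_KERNEL     1
#define PERF_RECORD_MISC_HYPERVISOR 3 /* what was used before the fix */

/* Assumption for this sketch: HV and PR sit in the top two bits. */
#define HVPR_SHIFT 62

/*
 * Stand-in for be64_to_cpu(READ_ONCE(...)): the record is stored
 * big-endian, so assemble it byte by byte regardless of host order.
 */
static uint64_t load_be64(const volatile unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Map the HV/PR field of the third double word to a cpumode. */
static unsigned int cpumode_for_record(const volatile unsigned char *third_dword)
{
    unsigned int hvpr;

    assert(third_dword != NULL);
    hvpr = (unsigned int)(load_be64(third_dword) >> HVPR_SHIFT);

    if (hvpr == 2) /* HV = 1, PR = 0: host kernel */
        return PERF_RECORD_MISC_KERNEL;

    return 0; /* other combinations are outside the scope of this sketch */
}
```

Reading the same eight bytes without the byte swap on a little-endian host would place the HV/PR field in the wrong byte, which is the first problem the commit message describes.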
    • A
      powerpc/perf: Fix crashes with generic_compat_pmu & BHRB · b460b512
      Alexey Kardashevskiy committed
      The bhrb_filter_map ("The Branch History Rolling Buffer") callback is
      only defined in raw CPUs' power_pmu structs. The "architected" CPUs
      use generic_compat_pmu, which does not have this callback, and crashes
      occur if a user tries to enable branch stack for an event.
      
      This adds a NULL pointer check for bhrb_filter_map() which behaves as
      if the callback returned an error.
      
      This does not add the same check for config_bhrb() as the only caller
      checks for cpuhw->bhrb_users which remains zero if bhrb_filter_map==0.
      
      Fixes: be80e758 ("powerpc/perf: Add generic compat mode pmu driver")
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200602025612.62707-1-aik@ozlabs.ru
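The shape of the NULL check can be sketched as below. This is a simplified userspace model, not the kernel source: `struct pmu_sketch`, `setup_branch_filter()` and `example_filter_map()` are hypothetical names, and treating an all-ones filter value as the callback's error return is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down PMU description with an optional callback. */
struct pmu_sketch {
    uint64_t (*bhrb_filter_map)(uint64_t branch_sample_type);
};

/* Hypothetical callback, as a "raw" CPU PMU might provide one. */
static uint64_t example_filter_map(uint64_t branch_sample_type)
{
    return branch_sample_type & 0xff;
}

/*
 * Start from the error value and only overwrite it when the callback
 * exists, so a missing callback is handled exactly like a callback
 * that reported an error.
 */
static int setup_branch_filter(const struct pmu_sketch *pmu,
                               uint64_t branch_sample_type,
                               uint64_t *filter_out)
{
    uint64_t filter = (uint64_t)-1; /* assumed error value */

    assert(pmu != NULL && filter_out != NULL);

    if (pmu->bhrb_filter_map)
        filter = pmu->bhrb_filter_map(branch_sample_type);

    if (filter == (uint64_t)-1)
        return -EOPNOTSUPP;

    *filter_out = filter;
    return 0;
}
```

With a PMU whose `bhrb_filter_map` is NULL, the call returns an error instead of jumping through a NULL function pointer.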
    • M
      powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode · b91eb518
      Michael Ellerman committed
      The recent commit 01eb0187 ("powerpc/64s: Fix restore_math
      unnecessarily changing MSR") changed some of the handling of floating
      point/vector restore.
      
      In particular it caused current->thread.fpexc_mode to be copied into
      the current MSR (via msr_check_and_set()), rather than just into
      regs->msr (which is moved into MSR on return to userspace).
      
      This can lead to a crash in the kernel if we take a floating point
      exception when restoring FPSCR:
      
        Oops: Exception in kernel mode, sig: 8 [#1]
        LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
        Modules linked in:
        CPU: 3 PID: 101213 Comm: ld64.so.2 Not tainted 5.9.0-rc1-00098-g18445bf4-dirty #9
        NIP:  c00000000000fbb4 LR: c00000000001a7ac CTR: c000000000183570
        REGS: c0000016b7cfb3b0 TRAP: 0700   Not tainted  (5.9.0-rc1-00098-g18445bf4-dirty)
        MSR:  900000000290b933 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002444  XER: 00000000
        CFAR: c00000000001a7a8 IRQMASK: 1
        GPR00: c00000000001ae40 c0000016b7cfb640 c0000000011b7f00 c000001542a0f740
        GPR04: c000001542a0f720 c000001542a0eb00 0000000000000900 c000001542a0eb00
        GPR08: 000000000000000a 0000000000002000 9000000000009033 0000000000000000
        GPR12: 0000000000004000 c0000017ffffd900 0000000000000001 c000000000df5a58
        GPR16: c000000000e19c18 c0000000010e1123 0000000000000001 c000000000e1a638
        GPR20: 0000000000000000 c0000000044b1d00 0000000000000000 c000001542a0f2a0
        GPR24: 00000016c7fe0000 c000001542a0f720 c000000001c93da0 c000000000fe5f28
        GPR28: c000001542a0f720 0000000000800000 c0000016b7cfbe90 0000000002802900
        NIP load_fp_state+0x4/0x214
        LR  restore_math+0x17c/0x1f0
        Call Trace:
          0xc0000016b7cfb680 (unreliable)
          __switch_to+0x330/0x460
          __schedule+0x318/0x920
          schedule+0x74/0x140
          schedule_timeout+0x318/0x3f0
          wait_for_completion+0xc8/0x210
          call_usermodehelper_exec+0x234/0x280
          do_coredump+0xedc/0x13c0
          get_signal+0x1d4/0xbe0
          do_notify_resume+0x1a0/0x490
          interrupt_exit_user_prepare+0x1c4/0x230
          interrupt_return+0x14/0x1c0
        Instruction dump:
        ebe10168 e88101a0 7c8ff120 382101e0 e8010010 7c0803a6 4e800020 790605c4
        782905c4 7c0008a8 7c0008a8 c8030200 <fffe058e> 48000088 c8030000 c8230010
      
      Fix it by only loading the fpexc_mode value into regs->msr.
      
      Also add a comment to explain that although VSX is subject to the
      value of fpexc_mode, we don't have to handle that separately because
      we only allow VSX to be enabled if FP is also enabled.
      
      Fixes: 01eb0187 ("powerpc/64s: Fix restore_math unnecessarily changing MSR")
      Reported-by: Milton Miller <miltonm@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Link: https://lore.kernel.org/r/20200825093424.3967813-1-mpe@ellerman.id.au
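The idea behind the fix can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct restore_model` and `restore_fp_model()` are hypothetical, the real code also handles vector state, and the FP, FE0 and FE1 bit positions are assumed here to be bits 13, 11 and 8 of the MSR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MSR bit positions for this sketch. */
#define MSR_FP  (1ULL << 13)
#define MSR_FE0 (1ULL << 11)
#define MSR_FE1 (1ULL << 8)

/* Hypothetical model of the state involved in the restore. */
struct restore_model {
    uint64_t cpu_msr;    /* MSR in effect while the kernel is running */
    uint64_t regs_msr;   /* MSR that takes effect on return to userspace */
    uint64_t fpexc_mode; /* the thread's FE0/FE1 exception mode bits */
};

static void restore_fp_model(struct restore_model *m)
{
    uint64_t fpexc_mode;

    assert(m != NULL);

    /* Enable FP in the live MSR only, so the state can be loaded. */
    m->cpu_msr |= MSR_FP;

    /* ... FP registers and FPSCR would be loaded here ... */
    fpexc_mode = m->fpexc_mode;

    /* The kernel is done with FP; FE0/FE1 never reached the live MSR. */
    m->cpu_msr &= ~MSR_FP;

    /* The exception mode only takes effect once userspace resumes. */
    m->regs_msr |= MSR_FP | fpexc_mode;
}
```

Because FE0/FE1 are never set in the live MSR, loading an FPSCR value with a pending enabled exception cannot trap while still in the kernel.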
    • N
      powerpc/64s: scv entry should set PPR · e5fe5609
      Nicholas Piggin committed
      Kernel entry sets PPR to HMT_MEDIUM by convention. The scv entry
      path missed this.
      
      Fixes: 7fa95f9a ("powerpc/64s: system call support for scv/rfscv instructions")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200825075309.224184-1-npiggin@gmail.com
  2. 24 Aug, 2020 2 commits
  3. 21 Aug, 2020 2 commits
  4. 20 Aug, 2020 3 commits
  5. 18 Aug, 2020 3 commits
    • M
      powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death · 801980f6
      Michael Roth committed
      For a power9 KVM guest with XIVE enabled, running a test loop
      where we hotplug 384 vcpus and then unplug them, the following traces
      can be seen (generally within a few loops) either from the unplugged
      vcpu:
      
        cpu 65 (hwid 65) Ready to die...
        Querying DEAD? cpu 66 (66) shows 2
        list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
        ------------[ cut here ]------------
        kernel BUG at lib/list_debug.c:56!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
        CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
        NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
        REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
        MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000842  XER: 20040000
        ...
        NIP __list_del_entry_valid+0xac/0x100
        LR  __list_del_entry_valid+0xa8/0x100
        Call Trace:
          __list_del_entry_valid+0xa8/0x100 (unreliable)
          free_pcppages_bulk+0x1f8/0x940
          free_unref_page+0xd0/0x100
          xive_spapr_cleanup_queue+0x148/0x1b0
          xive_teardown_cpu+0x1bc/0x240
          pseries_mach_cpu_die+0x78/0x2f0
          cpu_die+0x48/0x70
          arch_cpu_idle_dead+0x20/0x40
          do_idle+0x2f4/0x4c0
          cpu_startup_entry+0x38/0x40
          start_secondary+0x7bc/0x8f0
          start_secondary_prolog+0x10/0x14
      
      or on the worker thread handling the unplug:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
        BUG: Bad page state in process kworker/u768:3  pfn:95de1
        cpu 314 (hwid 314) Ready to die...
        page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
        flags: 0x5ffffc00000000()
        raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
        raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
        page dumped because: nonzero mapcount
        Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
        CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
        Workqueue: pseries hotplug workque pseries_hp_work_fn
        Call Trace:
          dump_stack+0xb0/0xf4 (unreliable)
          bad_page+0x12c/0x1b0
          free_pcppages_bulk+0x5bc/0x940
          page_alloc_cpu_dead+0x118/0x120
          cpuhp_invoke_callback.constprop.5+0xb8/0x760
          _cpu_down+0x188/0x340
          cpu_down+0x5c/0xa0
          cpu_subsys_offline+0x24/0x40
          device_offline+0xf0/0x130
          dlpar_offline_cpu+0x1c4/0x2a0
          dlpar_cpu_remove+0xb8/0x190
          dlpar_cpu_remove_by_index+0x12c/0x150
          dlpar_cpu+0x94/0x800
          pseries_hp_work_fn+0x128/0x1e0
          process_one_work+0x304/0x5d0
          worker_thread+0xcc/0x7a0
          kthread+0x1ac/0x1c0
          ret_from_kernel_thread+0x5c/0x80
      
      The latter trace is due to the following sequence:
      
        page_alloc_cpu_dead
          drain_pages
            drain_pages_zone
              free_pcppages_bulk
      
      where drain_pages() in this case is called under the assumption that
      the unplugged cpu is no longer executing. To ensure that is the case,
      an early call is made to __cpu_die()->pseries_cpu_die(), which runs a
      loop that waits for the cpu to reach a halted state by polling its
      status via query-cpu-stopped-state RTAS calls. It only polls for 25
      iterations before giving up, however, and in the trace above this
      results in the following being printed only 0.1 seconds after the
      hotplug worker thread begins processing the unplug request:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
      
      At that point the worker thread assumes the unplugged CPU is in some
      unknown/dead state and proceeds with the cleanup, causing the race
      with the XIVE cleanup code executed by the unplugged CPU.
      
      Fix this by waiting indefinitely, but also making an effort to avoid
      spurious lockup messages by allowing for rescheduling after polling
      the CPU status and printing a warning if we wait for longer than 120s.
      
      Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Tested-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      [mpe: Trim oopses in change log slightly for readability]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
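The waiting strategy described in this commit (poll without an iteration limit, warn periodically, leave room for rescheduling) can be sketched in userspace C. Everything here is illustrative: the status codes, `wait_for_cpu_death()`, and the `fake_cpu` helpers are hypothetical, and the status query and clock are injected as callbacks so the loop can be exercised without real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum { CPU_STOPPED = 0, CPU_NOT_STOPPED = 2 };

typedef int (*query_fn)(void *ctx);
typedef unsigned long (*clock_fn)(void *ctx); /* returns seconds */

/*
 * Poll until the CPU reports stopped. There is no iteration limit;
 * a warning is printed each time warn_after seconds elapse.
 * Returns the number of warnings printed.
 */
static unsigned int wait_for_cpu_death(query_fn query, clock_fn now,
                                       void *ctx, unsigned long warn_after)
{
    unsigned int warnings = 0;
    unsigned long deadline;

    assert(query != NULL && now != NULL);
    deadline = now(ctx) + warn_after;

    for (;;) {
        if (query(ctx) == CPU_STOPPED)
            return warnings;

        if (now(ctx) > deadline) {
            fprintf(stderr, "CPU didn't die after %lu seconds\n", warn_after);
            warnings++;
            deadline = now(ctx) + warn_after;
        }
        /* Kernel code would allow rescheduling at this point. */
    }
}

/* Fake CPU for demonstration: stops after a fixed number of polls. */
struct fake_cpu {
    unsigned long polls_left; /* polls answered "not stopped" */
    unsigned long now;        /* fake clock, one "second" per poll */
};

static int fake_query(void *ctx)
{
    struct fake_cpu *c = ctx;

    c->now++;
    if (c->polls_left == 0)
        return CPU_STOPPED;
    c->polls_left--;
    return CPU_NOT_STOPPED;
}

static unsigned long fake_now(void *ctx)
{
    return ((struct fake_cpu *)ctx)->now;
}
```

A caller would pass `fake_query`, `fake_now` and a `struct fake_cpu` to `wait_for_cpu_death()`. The function returns only once the CPU reports stopped, however long that takes, which is what lets the subsequent cleanup assume the CPU is no longer executing.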
    • C
      powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined · 7bee31ad
      Christophe Leroy committed
      When MODULES_VADDR is defined, is_module_segment() shall check the
      address against it instead of checking against VMALLOC_START.
      
      Fixes: 6ca05532 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu
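A minimal sketch of the check, assuming 256 MB segments and a purely hypothetical 32-bit address layout. The numeric addresses, `MODULES_END`, and the `ALIGN_UP`/`ALIGN_DOWN` helpers are illustrative choices for this example and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_256M 0x10000000U

/* Hypothetical layout: modules live in the segment below PAGE_OFFSET. */
#define PAGE_OFFSET   0xc0000000U
#define MODULES_VADDR (PAGE_OFFSET - SZ_256M)
#define MODULES_END   PAGE_OFFSET
#define VMALLOC_START 0xd0000000U
#define VMALLOC_END   0xf0000000U

#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))
#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((a) - 1))

/* True if addr falls in a segment that can hold module text. */
static bool is_module_segment(uint32_t addr)
{
#ifdef MODULES_VADDR
    /* Dedicated module area: compare against it, not vmalloc space. */
    uint32_t start = MODULES_VADDR, end = MODULES_END;
#else
    uint32_t start = VMALLOC_START, end = VMALLOC_END;
#endif

    assert(start < end);
    return addr >= ALIGN_DOWN(start, SZ_256M) &&
           addr <= ALIGN_UP(end, SZ_256M) - 1;
}
```

With this layout an address in the dedicated module segment is accepted, while an address in vmalloc space is rejected because `MODULES_VADDR` is defined.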
    • C
      powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 · 48d2f040
      Christophe Leroy committed
      On BOOK3S_32, when we have modules and strict kernel RWX, modules
      are not in vmalloc space but in a dedicated segment that is
      below PAGE_OFFSET.
      
      So KASAN_SHADOW_START must take it into account.
      
      MODULES_VADDR can't be used because it is not defined yet
      in kasan.h.
      
      Fixes: 6ca05532 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu
  6. 17 Aug, 2020 8 commits
  7. 15 Aug, 2020 18 commits