1. 17 Jan, 2018 1 commit
    • M
      powerpc/64s: Wire up cpu_show_meltdown() · fd6e440f
      Michael Ellerman committed
      The recent commit 87590ce6 ("sysfs/cpu: Add vulnerability folder")
      added a generic folder and set of files for reporting information on
      CPU vulnerabilities. One of those was for meltdown:
      
        /sys/devices/system/cpu/vulnerabilities/meltdown
      
      This commit wires up that file for 64-bit Book3S powerpc.
      
      For now we default to "Vulnerable" unless the RFI flush is enabled.
      That may not actually be true on all hardware; further patches will
      refine the reporting based on the CPU/platform etc. But for now we
      default to being pessimists.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      fd6e440f
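
      As a rough illustration of how the file named in the commit above
      might be consumed from user space, the sketch below reads one status
      line and checks whether it starts with "Vulnerable". The helper names
      read_status_line() and status_is_vulnerable() are hypothetical, and
      the only status string taken from the commit message is "Vulnerable";
      any other content of the file is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Path quoted in the commit message. */
#define MELTDOWN_SYSFS_PATH "/sys/devices/system/cpu/vulnerabilities/meltdown"

/*
 * Hypothetical helper: read the first line of `path` into `buf`,
 * strip the trailing newline, and NUL-terminate it.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
int read_status_line(const char *path, char *buf, size_t len)
{
        FILE *f;

        if (len == 0)
                return -1;
        f = fopen(path, "r");
        if (!f)
                return -1;
        if (!fgets(buf, (int)len, f)) {
                fclose(f);
                return -1;
        }
        fclose(f);
        buf[strcspn(buf, "\n")] = '\0';
        return 0;
}

/* Hypothetical helper: 1 if the status begins with "Vulnerable", else 0. */
int status_is_vulnerable(const char *status)
{
        return strncmp(status, "Vulnerable", strlen("Vulnerable")) == 0;
}
```

      A caller would pass MELTDOWN_SYSFS_PATH to read_status_line() and
      treat a -1 return as "not reported", since kernels without the
      vulnerabilities folder will not have the file.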
  2. 10 Jan, 2018 12 commits
  3. 09 Jan, 2018 1 commit
  4. 08 Jan, 2018 1 commit
    • M
      powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ · e2d59152
      Michael Ellerman committed
      The hotplug code uses its own workqueue to handle IRQ requests
      (pseries_hp_wq), however that workqueue is initialized after
      init_ras_IRQ(). That can lead to a kernel panic if any hotplug
      interrupts fire after init_ras_IRQ() but before pseries_hp_wq is
      initialized. For example:
      
        UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
        NET: Registered protocol family 1
        Unpacking initramfs...
        (qemu) object_add memory-backend-ram,id=mem1,size=10G
        (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
        Unable to handle kernel paging request for data at address 0xf94d03007c421378
        Faulting instruction address: 0xc00000000012d744
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-ziviani+ #26
        task:         (ptrval) task.stack:         (ptrval)
        NIP:  c00000000012d744 LR: c00000000012d744 CTR: 0000000000000000
        REGS:         (ptrval) TRAP: 0380   Not tainted  (4.15.0-rc2-ziviani+)
        MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28088042  XER: 20040000
        CFAR: c00000000012d3c4 SOFTE: 0
        ...
        NIP [c00000000012d744] __queue_work+0xd4/0x5c0
        LR [c00000000012d744] __queue_work+0xd4/0x5c0
        Call Trace:
        [c0000000fffefb90] [c00000000012d744] __queue_work+0xd4/0x5c0 (unreliable)
        [c0000000fffefc70] [c00000000012dce4] queue_work_on+0xb4/0xf0
      
      This commit makes the RAS IRQ registration explicitly dependent on the
      creation of the pseries_hp_wq.
      Reported-by: Min Deng <mdeng@redhat.com>
      Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      Tested-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      e2d59152
  5. 02 Jan, 2018 1 commit
    • J
      powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR · ecb101ae
      John Sperbeck committed
      The recent refactoring of the powerpc page fault handler in commit
      c3350602 ("powerpc/mm: Make bad_area* helper functions") caused
      access to protected memory regions to indicate SEGV_MAPERR instead of
      the traditional SEGV_ACCERR in the si_code field of a user-space
      signal handler. This can confuse debug libraries that temporarily
      change the protection of memory regions, and expect to use SEGV_ACCERR
      as an indication to restore access to a region.
      
      This commit restores the previous behavior. The following program
      exhibits the issue:
      
          $ ./repro read  || echo "FAILED"
          $ ./repro write || echo "FAILED"
          $ ./repro exec  || echo "FAILED"
      
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>
          #include <signal.h>
          #include <sys/mman.h>
          #include <assert.h>
      
          static void segv_handler(int n, siginfo_t *info, void *arg) {
                  _exit(info->si_code == SEGV_ACCERR ? 0 : 1);
          }
      
          int main(int argc, char **argv)
          {
                  void *p = NULL;
                  struct sigaction act = {
                          .sa_sigaction = segv_handler,
                          .sa_flags = SA_SIGINFO,
                  };
      
                  assert(argc == 2);
                  p = mmap(NULL, getpagesize(),
                          (strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
                          MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
                  assert(p != MAP_FAILED);
      
                  assert(sigaction(SIGSEGV, &act, NULL) == 0);
                  if (strcmp(argv[1], "read") == 0)
                          printf("%c", *(unsigned char *)p);
                  else if (strcmp(argv[1], "write") == 0)
                          *(unsigned char *)p = 0;
                  else if (strcmp(argv[1], "exec") == 0)
                          ((void (*)(void))p)();
                  return 1;  /* failed to generate SEGV */
          }
      
      Fixes: c3350602 ("powerpc/mm: Make bad_area* helper functions")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: John Sperbeck <jsperbeck@google.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Add commit references in change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ecb101ae
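
      The commit above mentions debug libraries that temporarily change
      the protection of a region and rely on SEGV_ACCERR to know when to
      restore access. A minimal user-space sketch of that pattern follows.
      All names here (restore_on_accerr(), write_through_guard()) are
      hypothetical, and calling mprotect() from a signal handler is an
      assumption that holds on Linux in practice but is not guaranteed by
      POSIX.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t accerr_faults; /* faults handled so far */
static void *guarded_page;                  /* region under protection */
static size_t guarded_len;

/* Restore write access only for SEGV_ACCERR faults inside the region. */
static void restore_on_accerr(int sig, siginfo_t *info, void *ctx)
{
        uintptr_t addr = (uintptr_t)info->si_addr;
        uintptr_t base = (uintptr_t)guarded_page;

        (void)sig;
        (void)ctx;
        /* SEGV_MAPERR or a foreign address is a real bug: give up. */
        if (info->si_code != SEGV_ACCERR ||
            addr < base || addr >= base + guarded_len)
                _exit(2);
        accerr_faults++;
        if (mprotect(guarded_page, guarded_len, PROT_READ | PROT_WRITE) != 0)
                _exit(3);
        /* Returning retries the faulting store, which now succeeds. */
}

/*
 * Map a page, make it read-only, then store `value` into it.
 * Returns 0 if exactly one SEGV_ACCERR fault was taken and the store
 * landed, 1 if not, and -1 on setup failure.
 */
int write_through_guard(unsigned char value)
{
        struct sigaction act = {
                .sa_sigaction = restore_on_accerr,
                .sa_flags = SA_SIGINFO,
        };
        struct sigaction old;
        int ok;

        guarded_len = (size_t)getpagesize();
        guarded_page = mmap(NULL, guarded_len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (guarded_page == MAP_FAILED)
                return -1;
        accerr_faults = 0;
        if (sigaction(SIGSEGV, &act, &old) != 0 ||
            mprotect(guarded_page, guarded_len, PROT_READ) != 0) {
                munmap(guarded_page, guarded_len);
                return -1;
        }

        *(volatile unsigned char *)guarded_page = value; /* faults once */

        ok = accerr_faults == 1 &&
             *(volatile unsigned char *)guarded_page == value;
        sigaction(SIGSEGV, &old, NULL);
        munmap(guarded_page, guarded_len);
        return ok ? 0 : 1;
}
```

      If the kernel reported SEGV_MAPERR for such a fault, as described in
      the commit message, the handler in this sketch would take the
      _exit(2) branch instead of restoring access.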
  6. 22 Dec, 2017 2 commits
  7. 19 Dec, 2017 1 commit
    • M
      powerpc/kernel: Print actual address of regs when oopsing · 182dc9c7
      Michael Ellerman committed
      When we oops or otherwise call show_regs() we print the address of the
      regs structure. Being able to see the address is fairly useful,
      firstly to verify that the regs pointer is not completely bogus, and
      secondly it allows you to dump the regs and surrounding memory with a
      debugger if you have one.
      
      In the normal case the regs will be located somewhere on the stack, so
      printing their location discloses no further information than printing
      the stack pointer does already.
      
      So switch to %px and print the actual address, not the hashed value.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      182dc9c7
  8. 13 Dec, 2017 3 commits
    • A
      powerpc/perf: Fix kfree memory allocated for nest pmus · 110df8bd
      Anju T Sudhakar committed
      imc_common_cpuhp_mem_free() is the common function for all
      IMC (In-memory Collection counters) domains to unregister cpuhotplug
      callback and free memory. Since kfree of memory allocated for
      nest-imc (per_nest_pmu_arr) is in the common code, all
      domains (core/nest/thread) can do the kfree in the failure case.
      
      This could potentially create a call trace as shown below, where
      core(/thread/nest) imc pmu initialization fails and in the failure
      path imc_common_cpuhp_mem_free() frees the memory (per_nest_pmu_arr),
      which is allocated by successfully registered nest units.
      
      The call trace is generated in a scenario where core-imc
      initialization is made to fail and a cpuhotplug is performed in a p9
      system. During cpuhotplug ppc_nest_imc_cpu_offline() tries to access
      per_nest_pmu_arr, which is already freed by core-imc.
      
        NIP [c000000000cb6a94] mutex_lock+0x34/0x90
        LR [c000000000cb6a88] mutex_lock+0x28/0x90
        Call Trace:
          mutex_lock+0x28/0x90 (unreliable)
          perf_pmu_migrate_context+0x90/0x3a0
          ppc_nest_imc_cpu_offline+0x190/0x1f0
          cpuhp_invoke_callback+0x160/0x820
          cpuhp_thread_fun+0x1bc/0x270
          smpboot_thread_fn+0x250/0x290
          kthread+0x1a8/0x1b0
          ret_from_kernel_thread+0x5c/0x74
      
      To address this scenario, do the kfree(per_nest_pmu_arr) only in case
      of nest-imc initialization failure, and only when no other nest
      units are registered.
      
      Fixes: 73ce9aec ("powerpc/perf: Fix IMC_MAX_PMU macro")
      Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      110df8bd
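
      The rule described in the commit above (free the shared array only
      on a nest failure, and only when no nest unit is still registered)
      can be modelled in plain user-space C. This is a simplified sketch,
      not the kernel code: shared_nest_arr stands in for per_nest_pmu_arr,
      and every function name, the enum, and MAX_NEST_UNITS are
      hypothetical.

```c
#include <stdlib.h>

#define MAX_NEST_UNITS 8 /* hypothetical capacity */

enum imc_domain { DOMAIN_NEST, DOMAIN_CORE, DOMAIN_THREAD };

static void **shared_nest_arr;    /* stands in for per_nest_pmu_arr */
static int registered_nest_units; /* nest units that initialized fine */

/* Common failure path shared by all domains. */
void imc_failure_cleanup(enum imc_domain domain)
{
        /* Core and thread failures must not touch nest memory. */
        if (domain != DOMAIN_NEST)
                return;
        /* Another nest unit still uses the array: keep it. */
        if (registered_nest_units > 0)
                return;
        free(shared_nest_arr);
        shared_nest_arr = NULL;
}

/* Initialize one nest unit; `fail` simulates an initialization error. */
int nest_unit_init(int fail)
{
        if (!shared_nest_arr) {
                shared_nest_arr = calloc(MAX_NEST_UNITS,
                                         sizeof(*shared_nest_arr));
                if (!shared_nest_arr)
                        return -1;
        }
        if (fail) {
                imc_failure_cleanup(DOMAIN_NEST);
                return -1;
        }
        registered_nest_units++;
        return 0;
}

/* 1 while the shared array is still allocated. */
int shared_array_is_allocated(void)
{
        return shared_nest_arr != NULL;
}
```

      In this model a failing core or thread unit calls
      imc_failure_cleanup() with its own domain and leaves the array
      alone, so a later hotplug callback for a registered nest unit never
      sees freed memory.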
    • A
      powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure · ad2b6e01
      Anju T Sudhakar committed
      Oops is observed during boot:
      
        Faulting instruction address: 0xc000000000248340
        cpu 0x0: Vector: 380 (Data Access Out of Range) at [c000000ff66fb850]
            pc: c000000000248340: event_function_call+0x50/0x1f0
            lr: c00000000024878c: perf_remove_from_context+0x3c/0x100
            sp: c000000ff66fbad0
           msr: 9000000000009033
           dar: 7d20e2a6f92d03c0
          pid = 14, comm = cpuhp/0
      
      While registering the cpuhotplug callbacks for nest-imc, if we fail in
      the cpuhotplug online path for any node in a multi-node system
      (because the opal call to stop nest-imc counters fails for that
      node), ppc_nest_imc_cpu_offline() will get invoked for the other nodes
      that returned successfully from the cpuhotplug online path.
      
      This call trace is generated since in the ppc_nest_imc_cpu_offline()
      path we are trying to migrate the event context, when nest-imc
      counters are not even initialized.
      
      Add a check to ensure that nest-imc is registered before migrating
      the event context.
      
      Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support")
      Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ad2b6e01
    • R
      powerpc/perf: Dereference BHRB entries safely · f41d84dd
      Ravi Bangoria committed
      It's theoretically possible that branch instructions recorded in
      BHRB (Branch History Rolling Buffer) entries have already been
      unmapped before they are processed by the kernel. Hence, trying to
      dereference such a memory location will result in a crash, e.g.:
      
          Unable to handle kernel paging request for data at address 0xd000000019c41764
          Faulting instruction address: 0xc000000000084a14
          NIP [c000000000084a14] branch_target+0x4/0x70
          LR [c0000000000eb828] record_and_restart+0x568/0x5c0
          Call Trace:
          [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable)
          [c0000000000ec378] perf_event_interrupt+0x298/0x460
          [c000000000027964] performance_monitor_exception+0x54/0x70
          [c000000000009ba4] performance_monitor_common+0x114/0x120
      
      Fix it by dereferencing the addresses safely.
      
      Fixes: 69123184 ("powerpc/perf: Fix setting of "to" addresses for BHRB")
      Cc: stable@vger.kernel.org # v3.10+
      Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      [mpe: Use probe_kernel_read() which is clearer, tweak change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f41d84dd
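
      The idea of dereferencing an address "safely", as in the commit
      above, has a rough user-space analogue: let the kernel copy from the
      address and report an error instead of faulting. The sketch below
      does this by writing from the address into a pipe. It is only an
      analogy and does not show how probe_kernel_read() works; the name
      safe_read_u32() is hypothetical.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Copy a 32-bit value from `addr` into `*out` without risking SIGSEGV.
 * write() fails (with EFAULT) when the source address is not readable,
 * so a bad pointer becomes an error return.
 * Returns 0 on success, -1 if the address could not be read.
 */
int safe_read_u32(const void *addr, uint32_t *out)
{
        int fds[2];
        int rc = -1;
        ssize_t n;

        if (pipe(fds) != 0)
                return -1;
        /* The kernel performs the read of `addr` on our behalf. */
        n = write(fds[1], addr, sizeof(*out));
        if (n == (ssize_t)sizeof(*out) &&
            read(fds[0], out, sizeof(*out)) == (ssize_t)sizeof(*out))
                rc = 0;
        close(fds[0]);
        close(fds[1]);
        return rc;
}
```

      A caller checks the return value before using *out, in the same
      spirit as checking the result of a safe read before computing a
      branch target.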
  9. 06 Dec, 2017 2 commits
    • M
      powerpc/xmon: Don't print hashed pointers in xmon · d8104182
      Michael Ellerman committed
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers printed with %p are hashed, ie. you don't see the actual
      pointer value but rather a cryptographic hash of its value.
      
      In xmon we want to see the actual pointer values, because xmon is a
      debugger, so replace %p with %px which prints the actual pointer
      value.
      
      We justify doing this in xmon because 1) xmon is a kernel crash
      debugger, it's only accessible via the console 2) xmon doesn't print
      to dmesg, so the pointers it prints are not able to be leaked that
      way.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d8104182
    • N
      powerpc/64s: Initialize ISAv3 MMU registers before setting partition table · 371b8044
      Nicholas Piggin committed
      kexec can leave MMU registers set when booting into a new kernel,
      the PIDR (Process Identification Register) in particular. The boot
      sequence does not zero PIDR, so it only gets set when CPUs first
      switch to a userspace process (until then it's running a kernel
      thread with effective PID = 0).
      
      This leaves a window where a process table entry and page tables are
      set up due to user processes running on other CPUs, that happen to
      match with a stale PID. The CPU with that PID may cause speculative
      accesses that address quadrant 0 (aka userspace addresses), which will
      result in cached translations and PWC (Page Walk Cache) for that
      process, on a CPU which is not in the mm_cpumask and so they will not
      be invalidated properly.
      
      The most common result is the kernel hanging in infinite page fault
      loops soon after kexec (usually in schedule_tail, which is usually the
      first non-speculative quadrant 0 access to a new PID) due to a stale
      PWC. However being a stale translation error, it could result in
      anything up to security and data corruption problems.
      
      Fix this by zeroing out PIDR at boot and kexec.
      
      Fixes: 7e381c0f ("powerpc/mm/radix: Add mmu context handling callback for radix")
      Cc: stable@vger.kernel.org # v4.7+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      371b8044
  10. 05 Dec, 2017 1 commit
    • D
      Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" · ab9dbf77
      David Gibson committed
      This reverts commit a3b2cb30.
      
      That commit tried to fix problems with panic on powerpc in certain
      circumstances, where some output from the generic panic code was being
      dropped.
      
      Unfortunately, it breaks things worse in other circumstances. In
      particular when running a PAPR guest, it will now attempt to reboot
      instead of informing the hypervisor (KVM or PowerVM) that the guest
      has crashed. The crash notification is important to some
      virtualization management layers.
      
      Revert it for now until we can come up with a better solution.
      
      Fixes: a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic notifier")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      [mpe: Tweak change log a bit]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ab9dbf77
  11. 04 Dec, 2017 1 commit
    • R
      powerpc/perf: Fix oops when grouping different pmu events · 5aa04b3e
      Ravi Bangoria committed
      When a user tries to group an imc (In-Memory Collections) event with
      a normal event, the kernel sometimes crashes with the following log:
      
          Faulting instruction address: 0x00000000
          [link register   ] c00000000010ce88 power_check_constraints+0x128/0x980
          ...
          c00000000010e238 power_pmu_event_init+0x268/0x6f0
          c0000000002dc60c perf_try_init_event+0xdc/0x1a0
          c0000000002dce88 perf_event_alloc+0x7b8/0xac0
          c0000000002e92e0 SyS_perf_event_open+0x530/0xda0
          c00000000000b004 system_call+0x38/0xe0
      
      The 'event_base' field of 'struct hw_perf_event' is used as flags for
      normal hw events and as a memory address for imc events. While
      grouping these two types of events, collect_events() tries to
      interpret the imc 'event_base' as flags, which causes corruption
      resulting in a crash.
      
      Consider only those events which belong to 'perf_hw_context' in
      collect_events().
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-By: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5aa04b3e
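
      To make the fix described in the commit above concrete, here is a
      simplified user-space model of collecting group members while
      skipping events that do not belong to the hardware context. The
      struct, the enum, and collect_hw_flags() are hypothetical stand-ins;
      they do not reproduce the kernel's struct hw_perf_event or
      collect_events().

```c
#include <stddef.h>

enum ctx_kind { CTX_HW, CTX_OTHER }; /* hypothetical context tags */

struct model_event {
        enum ctx_kind ctx;
        /* Flags for CTX_HW events, a memory address for others. */
        unsigned long event_base;
};

/*
 * Copy the flags of hardware-context events from `group` into `flags`,
 * skipping every other event so its event_base is never misread as
 * flags. Returns the number of events collected (at most `max`).
 */
size_t collect_hw_flags(const struct model_event *group, size_t n,
                        unsigned long *flags, size_t max)
{
        size_t count = 0;

        for (size_t i = 0; i < n && count < max; i++) {
                if (group[i].ctx != CTX_HW)
                        continue;
                flags[count++] = group[i].event_base;
        }
        return count;
}
```

      With a group that mixes both kinds of event, only the
      hardware-context entries are returned, so an address stored in
      event_base is never interpreted as a flag word.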
  12. 02 Dec, 2017 1 commit
  13. 01 Dec, 2017 13 commits