提交 · fd6e440f20b1a4304553775fc55938848ff617c9 · openanolis / cloud-kernel

17 1月, 2018 1 次提交

powerpc/64s: Wire up cpu_show_meltdown() · fd6e440f

由 Michael Ellerman 提交于 1月 16, 2018

The recent commit 87590ce6 ("sysfs/cpu: Add vulnerability folder")
added a generic folder and set of files for reporting information on
CPU vulnerabilities. One of those was for meltdown:

  /sys/devices/system/cpu/vulnerabilities/meltdown

This commit wires up that file for 64-bit Book3S powerpc.

For now we default to "Vulnerable" unless the RFI flush is enabled.
That may not actually be true on all hardware, further patches will
refine the reporting based on the CPU/platform etc. But for now we
default to being pessimists.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fd6e440f

10 1月, 2018 12 次提交

powerpc: Don't preempt_disable() in show_cpuinfo() · 349524bc

由 Benjamin Herrenschmidt 提交于 1月 10, 2018

This causes warnings from cpufreq mutex code. This is also rather
unnecessary and ineffective. If we really want to prevent concurrent
unplug, we could take the unplug read lock but I don't see this being
critical.

Fixes: cd77b5ce ("powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo")
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

349524bc

powerpc/xmon: Don't print hashed pointers in paca dump · 2248fade

由 Michael Ellerman 提交于 1月 11, 2018

Remember when the biggest problem we had to worry about was hashed
pointers, those were the days.

These were missed in my earlier patch because they don't match "%p",
but the macro is hiding a "%p", so these all end up being hashed,
which is not what we want in xmon. Convert them to "%px".
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

2248fade

M
powerpc/xmon: Add RFI flush related fields to paca dump · 274920a3
由 Michael Ellerman 提交于 1月 10, 2018
```
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
```
274920a3

powerpc/powernv: Check device-tree for RFI flush settings · 6e032b35

由 Oliver O'Halloran 提交于 1月 10, 2018

New device-tree properties are available which tell the hypervisor
settings related to the RFI flush. Use them to determine the
appropriate flush instruction to use, and whether the flush is
required.
Signed-off-by: NOliver O'Halloran <oohall@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6e032b35

powerpc/pseries: Query hypervisor for RFI flush settings · 8989d568

由 Michael Neuling 提交于 1月 10, 2018

A new hypervisor call is available which tells the guest settings
related to the RFI flush. Use it to query the appropriate flush
instruction(s), and whether the flush is required.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

8989d568

powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti · bc9c9304

由 Michael Ellerman 提交于 1月 10, 2018

Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.

We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate every
one about a different command line option.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

bc9c9304

powerpc/64s: Add support for RFI flush of L1-D cache · aa8a5e00

由 Michael Ellerman 提交于 1月 10, 2018

On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.

This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.

The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.

In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.

In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.

For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.

In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, back ports testing etc.
Many thanks to all of them.
Tested-by: NJon Masters <jcm@redhat.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

aa8a5e00

powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL · c7305645

由 Nicholas Piggin 提交于 1月 10, 2018

In the SLB miss handler we may be returning to user or kernel. We need
to add a check early on and save the result in the cr4 register, and
then we bifurcate the return path based on that.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

c7305645

powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL · a08f828c

由 Nicholas Piggin 提交于 1月 10, 2018

Similar to the syscall return path, in fast_exception_return we may be
returning to user or kernel context. We already have a test for that,
because we conditionally restore r13. So use that existing test and
branch, and bifurcate the return based on that.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a08f828c

powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL · b8e90cb7

由 Nicholas Piggin 提交于 1月 10, 2018

In the syscall exit path we may be returning to user or kernel
context. We already have a test for that, because we conditionally
restore r13. So use that existing test and branch, and bifurcate the
return based on that.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

b8e90cb7

powerpc/64s: Simple RFI macro conversions · 222f20f1

由 Nicholas Piggin 提交于 1月 10, 2018

This commit does simple conversions of rfi/rfid to the new macros that
include the expected destination context. By simple we mean cases
where there is a single well known destination context, and it's
simply a matter of substituting the instruction for the appropriate
macro.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

222f20f1

powerpc/64: Add macros for annotating the destination of rfid/hrfid · 50e51c13

由 Nicholas Piggin 提交于 1月 10, 2018

The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is
used for switching from the kernel to userspace, and from the
hypervisor to the guest kernel. However it can and is also used for
other transitions, eg. from real mode kernel code to virtual mode
kernel code, and it's not always clear from the code what the
destination context is.

To make it clearer when reading the code, add macros which encode the
expected destination context.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

50e51c13

09 1月, 2018 1 次提交

powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper · 191eccb1

由 Michael Neuling 提交于 1月 09, 2018

A new hypervisor call has been defined to communicate various
characteristics of the CPU to guests. Add definitions for the hcall
number, flags and a wrapper function.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

191eccb1

08 1月, 2018 1 次提交

powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ · e2d59152

由 Michael Ellerman 提交于 1月 08, 2018

The hotplug code uses its own workqueue to handle IRQ requests
(pseries_hp_wq), however that workqueue is initialized after
init_ras_IRQ(). That can lead to a kernel panic if any hotplug
interrupts fire after init_ras_IRQ() but before pseries_hp_wq is
initialised. eg:

  UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
  NET: Registered protocol family 1
  Unpacking initramfs...
  (qemu) object_add memory-backend-ram,id=mem1,size=10G
  (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
  Unable to handle kernel paging request for data at address 0xf94d03007c421378
  Faulting instruction address: 0xc00000000012d744
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-ziviani+ #26
  task:         (ptrval) task.stack:         (ptrval)
  NIP:  c00000000012d744 LR: c00000000012d744 CTR: 0000000000000000
  REGS:         (ptrval) TRAP: 0380   Not tainted  (4.15.0-rc2-ziviani+)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28088042  XER: 20040000
  CFAR: c00000000012d3c4 SOFTE: 0
  ...
  NIP [c00000000012d744] __queue_work+0xd4/0x5c0
  LR [c00000000012d744] __queue_work+0xd4/0x5c0
  Call Trace:
  [c0000000fffefb90] [c00000000012d744] __queue_work+0xd4/0x5c0 (unreliable)
  [c0000000fffefc70] [c00000000012dce4] queue_work_on+0xb4/0xf0

This commit makes the RAS IRQ registration explicitly dependent on the
creation of the pseries_hp_wq.
Reported-by: NMin Deng <mdeng@redhat.com>
Reported-by: NDaniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: NJose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>

e2d59152

02 1月, 2018 1 次提交

powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR · ecb101ae

由 John Sperbeck 提交于 12月 31, 2017

The recent refactoring of the powerpc page fault handler in commit
c3350602 ("powerpc/mm: Make bad_area* helper functions") caused
access to protected memory regions to indicate SEGV_MAPERR instead of
the traditional SEGV_ACCERR in the si_code field of a user-space
signal handler. This can confuse debug libraries that temporarily
change the protection of memory regions, and expect to use SEGV_ACCERR
as an indication to restore access to a region.

This commit restores the previous behavior. The following program
exhibits the issue:

    $ ./repro read  || echo "FAILED"
    $ ./repro write || echo "FAILED"
    $ ./repro exec  || echo "FAILED"

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/mman.h>
    #include <assert.h>

    static void segv_handler(int n, siginfo_t *info, void *arg) {
            _exit(info->si_code == SEGV_ACCERR ? 0 : 1);
    }

    int main(int argc, char **argv)
    {
            void *p = NULL;
            struct sigaction act = {
                    .sa_sigaction = segv_handler,
                    .sa_flags = SA_SIGINFO,
            };

            assert(argc == 2);
            p = mmap(NULL, getpagesize(),
                    (strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
                    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
            assert(p != MAP_FAILED);

            assert(sigaction(SIGSEGV, &act, NULL) == 0);
            if (strcmp(argv[1], "read") == 0)
                    printf("%c", *(unsigned char *)p);
            else if (strcmp(argv[1], "write") == 0)
                    *(unsigned char *)p = 0;
            else if (strcmp(argv[1], "exec") == 0)
                    ((void (*)(void))p)();
            return 1;  /* failed to generate SEGV */
    }

Fixes: c3350602 ("powerpc/mm: Make bad_area* helper functions")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: NJohn Sperbeck <jsperbeck@google.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Add commit references in change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ecb101ae

22 12月, 2017 2 次提交

KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() · 7333b5ac

由 Laurent Vivier 提交于 12月 12, 2017

When we migrate a VM from a POWER8 host (XICS) to a POWER9 host
(XICS-on-XIVE), we have an error:

qemu-kvm: Unable to restore KVM interrupt controller state \
          (0xff000000) for CPU 0: Invalid argument

This is because kvmppc_xics_set_icp() checks the new state
is internaly consistent, and especially:

...
   1129         if (xisr == 0) {
   1130                 if (pending_pri != 0xff)
   1131                         return -EINVAL;
...

On the other side, kvmppc_xive_get_icp() doesn't set
neither the pending_pri value, nor the xisr value (set to 0)
(and kvmppc_xive_set_icp() ignores the pending_pri value)

As xisr is 0, pending_pri must be set to 0xff.

Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: NLaurent Vivier <lvivier@redhat.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7333b5ac

KVM: PPC: Book3S: fix XIVE migration of pending interrupts · dc1c4165

由 Cédric Le Goater 提交于 12月 12, 2017

When restoring a pending interrupt, we are setting the Q bit to force
a retrigger in xive_finish_unmask(). But we also need to force an EOI
in this case to reach the same initial state : P=1, Q=0.

This can be done by not setting 'old_p' for pending interrupts which
will inform xive_finish_unmask() that an EOI needs to be sent.

Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NLaurent Vivier <lvivier@redhat.com>
Tested-by: NLaurent Vivier <lvivier@redhat.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

dc1c4165

19 12月, 2017 1 次提交

powerpc/kernel: Print actual address of regs when oopsing · 182dc9c7

由 Michael Ellerman 提交于 12月 18, 2017

When we oops or otherwise call show_regs() we print the address of the
regs structure. Being able to see the address is fairly useful,
firstly to verify that the regs pointer is not completely bogus, and
secondly it allows you to dump the regs and surrounding memory with a
debugger if you have one.

In the normal case the regs will be located somewhere on the stack, so
printing their location discloses no further information than printing
the stack pointer does already.

So switch to %px and print the actual address, not the hashed value.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

182dc9c7

13 12月, 2017 3 次提交

powerpc/perf: Fix kfree memory allocated for nest pmus · 110df8bd

由 Anju T Sudhakar 提交于 12月 07, 2017

imc_common_cpuhp_mem_free() is the common function for all
IMC (In-memory Collection counters) domains to unregister cpuhotplug
callback and free memory. Since kfree of memory allocated for
nest-imc (per_nest_pmu_arr) is in the common code, all
domains (core/nest/thread) can do the kfree in the failure case.

This could potentially create a call trace as shown below, where
core(/thread/nest) imc pmu initialization fails and in the failure
path imc_common_cpuhp_mem_free() free the memory(per_nest_pmu_arr),
which is allocated by successfully registered nest units.

The call trace is generated in a scenario where core-imc
initialization is made to fail and a cpuhotplug is performed in a p9
system. During cpuhotplug ppc_nest_imc_cpu_offline() tries to access
per_nest_pmu_arr, which is already freed by core-imc.

  NIP [c000000000cb6a94] mutex_lock+0x34/0x90
  LR [c000000000cb6a88] mutex_lock+0x28/0x90
  Call Trace:
    mutex_lock+0x28/0x90 (unreliable)
    perf_pmu_migrate_context+0x90/0x3a0
    ppc_nest_imc_cpu_offline+0x190/0x1f0
    cpuhp_invoke_callback+0x160/0x820
    cpuhp_thread_fun+0x1bc/0x270
    smpboot_thread_fn+0x250/0x290
    kthread+0x1a8/0x1b0
    ret_from_kernel_thread+0x5c/0x74

To address this scenario do the kfree(per_nest_pmu_arr) only in case
of nest-imc initialization failure, and when there is no other nest
units registered.

Fixes: 73ce9aec ("powerpc/perf: Fix IMC_MAX_PMU macro")
Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

110df8bd

powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure · ad2b6e01

由 Anju T Sudhakar 提交于 12月 05, 2017

Oops is observed during boot:

  Faulting instruction address: 0xc000000000248340
  cpu 0x0: Vector: 380 (Data Access Out of Range) at [c000000ff66fb850]
      pc: c000000000248340: event_function_call+0x50/0x1f0
      lr: c00000000024878c: perf_remove_from_context+0x3c/0x100
      sp: c000000ff66fbad0
     msr: 9000000000009033
     dar: 7d20e2a6f92d03c0
    pid = 14, comm = cpuhp/0

While registering the cpuhotplug callbacks for nest-imc, if we fail in
the cpuhotplug online path for any random node in a multi node
system (because the opal call to stop nest-imc counters fails for that
node), ppc_nest_imc_cpu_offline() will get invoked for other nodes who
successfully returned from cpuhotplug online path.

This call trace is generated since in the ppc_nest_imc_cpu_offline()
path we are trying to migrate the event context, when nest-imc
counters are not even initialized.

Patch to add a check to ensure that nest-imc is registered before
migrating the event context.

Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support")
Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ad2b6e01

powerpc/perf: Dereference BHRB entries safely · f41d84dd

由 Ravi Bangoria 提交于 12月 12, 2017

It's theoretically possible that branch instructions recorded in
BHRB (Branch History Rolling Buffer) entries have already been
unmapped before they are processed by the kernel. Hence, trying to
dereference such memory location will result in a crash. eg:

    Unable to handle kernel paging request for data at address 0xd000000019c41764
    Faulting instruction address: 0xc000000000084a14
    NIP [c000000000084a14] branch_target+0x4/0x70
    LR [c0000000000eb828] record_and_restart+0x568/0x5c0
    Call Trace:
    [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable)
    [c0000000000ec378] perf_event_interrupt+0x298/0x460
    [c000000000027964] performance_monitor_exception+0x54/0x70
    [c000000000009ba4] performance_monitor_common+0x114/0x120

Fix it by deferefencing the addresses safely.

Fixes: 69123184 ("powerpc/perf: Fix setting of "to" addresses for BHRB")
Cc: stable@vger.kernel.org # v3.10+
Suggested-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Use probe_kernel_read() which is clearer, tweak change log]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

f41d84dd

06 12月, 2017 2 次提交

powerpc/xmon: Don't print hashed pointers in xmon · d8104182

由 Michael Ellerman 提交于 12月 06, 2017

Since commit ad67b74d ("printk: hash addresses printed with %p")
pointers printed with %p are hashed, ie. you don't see the actual
pointer value but rather a cryptographic hash of its value.

In xmon we want to see the actual pointer values, because xmon is a
debugger, so replace %p with %px which prints the actual pointer
value.

We justify doing this in xmon because 1) xmon is a kernel crash
debugger, it's only accessible via the console 2) xmon doesn't print
to dmesg, so the pointers it prints are not able to be leaked that
way.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

d8104182

powerpc/64s: Initialize ISAv3 MMU registers before setting partition table · 371b8044

由 Nicholas Piggin 提交于 12月 06, 2017

kexec can leave MMU registers set when booting into a new kernel,
the PIDR (Process Identification Register) in particular. The boot
sequence does not zero PIDR, so it only gets set when CPUs first
switch to a userspace processes (until then it's running a kernel
thread with effective PID = 0).

This leaves a window where a process table entry and page tables are
set up due to user processes running on other CPUs, that happen to
match with a stale PID. The CPU with that PID may cause speculative
accesses that address quadrant 0 (aka userspace addresses), which will
result in cached translations and PWC (Page Walk Cache) for that
process, on a CPU which is not in the mm_cpumask and so they will not
be invalidated properly.

The most common result is the kernel hanging in infinite page fault
loops soon after kexec (usually in schedule_tail, which is usually the
first non-speculative quadrant 0 access to a new PID) due to a stale
PWC. However being a stale translation error, it could result in
anything up to security and data corruption problems.

Fix this by zeroing out PIDR at boot and kexec.

Fixes: 7e381c0f ("powerpc/mm/radix: Add mmu context handling callback for radix")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

371b8044

05 12月, 2017 1 次提交

Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" · ab9dbf77

由 David Gibson 提交于 12月 04, 2017

This reverts commit a3b2cb30.

That commit tried to fix problems with panic on powerpc in certain
circumstances, where some output from the generic panic code was being
dropped.

Unfortunately, it breaks things worse in other circumstances. In
particular when running a PAPR guest, it will now attempt to reboot
instead of informing the hypervisor (KVM or PowerVM) that the guest
has crashed. The crash notification is important to some
virtualization management layers.

Revert it for now until we can come up with a better solution.

Fixes: a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic notifier")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
[mpe: Tweak change log a bit]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ab9dbf77

04 12月, 2017 1 次提交

powerpc/perf: Fix oops when grouping different pmu events · 5aa04b3e

由 Ravi Bangoria 提交于 11月 30, 2017

When user tries to group imc (In-Memory Collections) event with
normal event, (sometime) kernel crashes with following log:

    Faulting instruction address: 0x00000000
    [link register   ] c00000000010ce88 power_check_constraints+0x128/0x980
    ...
    c00000000010e238 power_pmu_event_init+0x268/0x6f0
    c0000000002dc60c perf_try_init_event+0xdc/0x1a0
    c0000000002dce88 perf_event_alloc+0x7b8/0xac0
    c0000000002e92e0 SyS_perf_event_open+0x530/0xda0
    c00000000000b004 system_call+0x38/0xe0

'event_base' field of 'struct hw_perf_event' is used as flags for
normal hw events and used as memory address for imc events. While
grouping these two types of events, collect_events() tries to
interpret imc 'event_base' as a flag, which causes a corruption
resulting in a crash.

Consider only those events which belongs to 'perf_hw_context' in
collect_events().
Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-By: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

5aa04b3e

02 12月, 2017 1 次提交

RISC-V: __io_writes should respect the length argument · da894ff1

由 Palmer Dabbelt 提交于 11月 28, 2017

Whoops -- I must have just been being an idiot again.  Thanks to Segher
for finding the bug :).

CC: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

da894ff1

01 12月, 2017 13 次提交

arm64: context: Fix comments and remove pointless smp_wmb() · 3a33c760

由 Will Deacon 提交于 11月 30, 2017

The comments in the ASID allocator incorrectly hint at an MP-style idiom
using the asid_generation and the active_asids array. In fact, the
synchronisation is achieved using a combination of an xchg operation
and a spinlock, so update the comments and remove the pointless smp_wmb().

Cc: James Morse <james.morse@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3a33c760

arm64: cpu_ops: Add missing 'const' qualifiers · 770ba060

由 Yury Norov 提交于 11月 29, 2017

Building the kernel with an LTO-enabled GCC spits out the following "const"
warning for the cpu_ops code:

  mm/percpu.c:2168:20: error: pcpu_fc_names causes a section type conflict
  with dt_supported_cpu_ops
  const char * const pcpu_fc_names[PCPU_FC_NR] __initconst = {
          ^
  arch/arm64/kernel/cpu_ops.c:34:37: note: ‘dt_supported_cpu_ops’ was declared here
  static const struct cpu_operations *dt_supported_cpu_ops[] __initconst = {

Fix it by adding missed const qualifiers.
Signed-off-by: NYury Norov <ynorov@caviumnetworks.com>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

770ba060

arm64: perf: remove unsupported events for Cortex-A73 · f8ada189

由 Xu YiPing 提交于 11月 15, 2017

bus access read/write events are not supported in A73, based on the
Cortex-A73 TRM r0p2, section 11.9 Events (pages 11-457 to 11-460).

Fixes: 5561b6c5 "arm64: perf: add support for Cortex-A73"
Acked-by: NJulien Thierry <julien.thierry@arm.com>
Signed-off-by: NXu YiPing <xuyiping@hisilicon.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f8ada189

arm64: fpsimd: Fix failure to restore FPSIMD state after signals · 9de52a75

由 Dave Martin 提交于 11月 30, 2017

The fpsimd_update_current_state() function is responsible for
loading the FPSIMD state from the user signal frame into the
current task during sigreturn.  When implementing support for SVE,
conditional code was added to this function in order to handle the
case where SVE state need to be loaded for the task and merged with
the FPSIMD data from the signal frame; however, the FPSIMD-only
case was unintentionally dropped.

As a result of this, sigreturn does not currently restore the
FPSIMD state of the task, except in the case where the system
supports SVE and the signal frame contains SVE state in addition to
FPSIMD state.

This patch fixes this bug by making the copy-in of the FPSIMD data
from the signal frame to thread_struct unconditional.

This remains a performance regression from v4.14, since the FPSIMD
state is now copied into thread_struct and then loaded back,
instead of _only_ being loaded into the CPU FPSIMD registers.
However, it is essential to call task_fpsimd_load() here anyway in
order to ensure that the SVE enable bit in CPACR_EL1 is set
correctly before returning to userspace.  This could use some
refactoring, but since sigreturn is not a fast path I have kept
this patch as a pure fix and left the refactoring for later.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 8cd969d2 ("arm64/sve: Signal handling support")
Reported-by: NAlex Bennée <alex.bennee@linaro.org>
Tested-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

9de52a75

arm64: pgd: Mark pgd_cache as __ro_after_init · a349b302

由 Jinbum Park 提交于 11月 22, 2017

pgd_cache is setup once while init stage and never changed after
that, so it is good candidate for __ro_after_init
Signed-off-by: NJinbum Park <jinb.park7@gmail.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

a349b302

arm64: ftrace: emit ftrace-mod.o contents through code · be0f272b

由 Ard Biesheuvel 提交于 11月 20, 2017

When building the arm64 kernel with both CONFIG_ARM64_MODULE_PLTS and
CONFIG_DYNAMIC_FTRACE enabled, the ftrace-mod.o object file is built
with the kernel and contains a trampoline that is linked into each
module, so that modules can be loaded far away from the kernel and
still reach the ftrace entry point in the core kernel with an ordinary
relative branch, as is emitted by the compiler instrumentation code
dynamic ftrace relies on.

In order to be able to build out of tree modules, this object file
needs to be included into the linux-headers or linux-devel packages,
which is undesirable, as it makes arm64 a special case (although a
precedent does exist for 32-bit PPC).

Given that the trampoline essentially consists of a PLT entry, let's
not bother with a source or object file for it, and simply patch it
in whenever the trampoline is being populated, using the existing
PLT support routines.

Cc: <stable@vger.kernel.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

be0f272b

arm64: module-plts: factor out PLT generation code for ftrace · 7e8b9c1d

由 Ard Biesheuvel 提交于 11月 20, 2017

To allow the ftrace trampoline code to reuse the PLT entry routines,
factor it out and move it into asm/module.h.

Cc: <stable@vger.kernel.org>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

7e8b9c1d

RISC-V: Clean up an unused include · 0e710ac6

由 Palmer Dabbelt 提交于 11月 21, 2017

We used to have some cmpxchg syscalls.  They're no longer there, so we
no longer need the include.

CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

0e710ac6

RISC-V: Allow userspace to flush the instruction cache · 921ebd8f

由 Andrew Waterman 提交于 10月 25, 2017

Despite RISC-V having a direct 'fence.i' instruction available to
userspace (which we can't trap!), that's not actually viable when
running on Linux because the kernel might schedule a process on another
hart. There is no way for userspace to handle this without invoking the
kernel (as it doesn't know the thread->hart mappings), so we've defined
a RISC-V specific system call to flush the instruction cache.

This patch adds both a system call and a VDSO entry. If possible, we'd
like to avoid having the system call be considered part of the
user-facing ABI and instead restrict that to the VDSO entry -- both just
in general to avoid having additional user-visible ABI to maintain, and
because we'd prefer that users just call the VDSO entry because there
might be a better way to do this in the future (ie, one that doesn't
require entering the kernel).
Signed-off-by: NAndrew Waterman <andrew@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

921ebd8f

RISC-V: Flush I$ when making a dirty page executable · 08f051ed

由 Andrew Waterman 提交于 10月 25, 2017

The RISC-V ISA allows for instruction caches that are not coherent WRT
stores, even on a single hart.  As a result, we need to explicitly flush
the instruction cache whenever marking a dirty page as executable in
order to preserve the correct system behavior.

Local instruction caches aren't that scary (our implementations actually
flush the cache, but RISC-V is defined to allow higher-performance
implementations to exist), but RISC-V defines no way to perform an
instruction cache shootdown.  When explicitly asked to do so we can
shoot down remote instruction caches via an IPI, but this is a bit on
the slow side.

Instead of requiring an IPI to all harts whenever marking a page as
executable, we simply flush the currently running harts.  In order to
maintain correct behavior, we additionally mark every other hart as
needing a deferred instruction cache which will be taken before anything
runs on it.
Signed-off-by: NAndrew Waterman <andrew@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

08f051ed

RISC-V: Add missing include · 741fc3ff

由 Olof Johansson 提交于 11月 29, 2017

Fixes:

include/asm-generic/mm_hooks.h:20:11: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/mm_hooks.h:19:38: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
Signed-off-by: NOlof Johansson <olof@lixom.net>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

741fc3ff

O
RISC-V: Use define for get_cycles like other architectures · 4a41d5db
由 Olof Johansson 提交于 11月 29, 2017
```
Signed-off-by: NOlof Johansson <olof@lixom.net>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>
```
4a41d5db

RISC-V: Provide stub of setup_profiling_timer() · 4bde6328

由 Olof Johansson 提交于 11月 29, 2017

Fixes the following on allmodconfig build:

profile.c:(.text+0x3e4): undefined reference to `setup_profiling_timer'
Signed-off-by: NOlof Johansson <olof@lixom.net>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

4bde6328

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功