1. 12 11月, 2017 1 次提交
  2. 06 11月, 2017 12 次提交
    • N
      powerpc: add POWER9_DD20 feature · b6b3755e
      Nicholas Piggin 提交于
      Cc: Michael Neuling <mikey@neuling.org>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b6b3755e
    • C
      powerpc: Always save/restore checkpointed regs during treclaim/trecheckpoint · eb5c3f1c
      Cyril Bur 提交于
      Lazy save and restore of FP/Altivec means that a userspace process can
      be sent to userspace with FP or Altivec disabled and loaded only as
      required (by way of an FP/Altivec unavailable exception). Transactional
      Memory complicates this situation as a transaction could be started
      without FP/Altivec being loaded up. This causes the hardware to
      checkpoint incorrect registers. Handling FP/Altivec unavailable
      exceptions while a thread is transactional requires a reclaim and
      recheckpoint to ensure the CPU has correct state for both sets of
      registers.
      
      tm_reclaim() has optimisations to not always save the FP/Altivec
      registers to the checkpointed save area. This was originally done
      because the caller might have information that the checkpointed
      registers aren't valid due to lazy save and restore. We've also been a
      little vague as to how tm_reclaim() leaves the FP/Altivec state since it
      doesn't necessarily always save it to the thread struct. This has lead
      to an (incorrect) assumption that it leaves the checkpointed state on
      the CPU.
      
      tm_recheckpoint() has similar optimisations in reverse. It may not
      always reload the checkpointed FP/Altivec registers from the thread
      struct before the trecheckpoint. It is therefore quite unclear where it
      expects to get the state from. This didn't help with the assumption
      made about tm_reclaim().
      
      These optimisations sit in what is by definition a slow path. If a
      process has to go through a reclaim/recheckpoint then its transaction
      will be doomed on returning to userspace. This mean that the process
      will be unable to complete its transaction and be forced to its failure
      handler. This is already an out if line case for userspace. Furthermore,
      the cost of copying 64 times 128 bits from registers isn't very long[0]
      (at all) on modern processors. As such it appears these optimisations
      have only served to increase code complexity and are unlikely to have
      had a measurable performance impact.
      
      Our transactional memory handling has been riddled with bugs. A cause
      of this has been difficulty in following the code flow, code complexity
      has not been our friend here. It makes sense to remove these
      optimisations in favour of a (hopefully) more stable implementation.
      
      This patch does mean that some times the assembly will needlessly save
      'junk' registers which will subsequently get overwritten with the
      correct value by the C code which calls the assembly function. This
      small inefficiency is far outweighed by the reduction in complexity for
      general TM code, context switching paths, and transactional facility
      unavailable exception handler.
      
      0: I tried to measure it once for other work and found that it was
      hiding in the noise of everything else I was working with. I find it
      exceedingly likely this will be the case here.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      eb5c3f1c
    • C
      powerpc/opal: Add opal_async_wait_response_interruptible() to opal-async · 9aab2449
      Cyril Bur 提交于
      This patch adds an _interruptible version of opal_async_wait_response().
      This is useful when a long running OPAL call is performed on behalf of
      a userspace thread, for example, the opal_flash_{read,write,erase}
      functions performed by the powernv-flash MTD driver.
      
      It is foreseeable that these functions would take upwards of two
      minutes causing the wait_event() to block long enough to cause hung
      task warnings. Furthermore, wait_event_interruptible() is preferable
      as otherwise there is no way for signals to stop the process which is
      going to be confusing in userspace.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9aab2449
    • C
      powerpc/opal: Make __opal_async_{get, release}_token() static · 59cf9a1c
      Cyril Bur 提交于
      There are no callers of both __opal_async_get_token() and
      __opal_async_release_token().
      
      This patch also removes the possibility of "emergency through
      synchronous call to __opal_async_get_token()" as such it makes more
      sense to initialise opal_sync_sem for the maximum number of async
      tokens.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      59cf9a1c
    • A
      powerpc/eeh: Stop using do_gettimeofday() · edfd17ff
      Arnd Bergmann 提交于
      This interface is inefficient and deprecated because of the y2038
      overflow.
      
      ktime_get_seconds() is an appropriate replacement here, since it
      has sufficient granularity but is more efficient and uses monotonic
      time.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: NRussell Currey <ruscur@russell.cc>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      edfd17ff
    • M
      powerpc/64s: Replace CONFIG_PPC_STD_MMU_64 with CONFIG_PPC_BOOK3S_64 · 4e003747
      Michael Ellerman 提交于
      CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
      on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
      found in "server" processors, from IBM mainly.
      
      Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
      annoying to have two symbols that always have the same value, it's not
      quite annoying enough to bother removing one.
      
      However with the arrival of Power9, we now have the situation where
      CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
      Radix MMU - *not* the "standard" MMU. So it is now actively confusing
      to use it, because it implies that code is disabled or inactive when
      the Radix MMU is in use, however that is not necessarily true.
      
      So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
      formatting updates of some of the affected lines.
      
      This will be a pain for backports, but c'est la vie.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4e003747
    • M
      powerpc/64: Free up CPU_FTR_ICSWX · c1807e3f
      Michael Ellerman 提交于
      The last user of CPU_FTR_ICSWX was removed in commit
      6ff4d3e9 ("powerpc: Remove old unused icswx based coprocessor
      support"), so free the bit up for future use.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c1807e3f
    • A
      powerpc/powernv: Reserve a hole which appears after enabling IOV · d6f934fd
      Alexey Kardashevskiy 提交于
      In order to make generic IOV code work, the physical function IOV BAR
      should start from offset of the first VF. Since M64 segments share
      PE number space across PHB, and some PEs may be in use at the time
      when IOV is enabled, the existing code shifts the IOV BAR to the index
      of the first PE/VF. This creates a hole in IOMEM space which can be
      potentially taken by some other device.
      
      This reserves a temporary hole on a parent and releases it when IOV is
      disabled; the temporary resources are stored in pci_dn to avoid
      kmalloc/free.
      Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: NBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d6f934fd
    • N
      powerpc/64s/radix: Fix process table entry cache invalidation · 30b49ec7
      Nicholas Piggin 提交于
      According to the architecture, the process table entry cache must be
      flushed with tlbie RIC=2.
      
      Currently the process table entry is set to invalid right before the
      PID is returned to the allocator, with no invalidation. This works on
      existing implementations that are known to not cache the process table
      entry for any except the current PIDR.
      
      It is architecturally correct and cleaner to invalidate with RIC=2
      after clearing the process table entry and before the PID is returned
      to the allocator. This can be done in arch_exit_mmap that runs before
      the final flush, and to ensure the final flush (fullmm) is always a
      RIC=2 variant.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      30b49ec7
    • N
      powerpc/book3s: Add an HV variant of FIXUP_ENDIAN that is recoverable · 8ca9c08d
      Nicholas Piggin 提交于
      Add an HV variant of FIXUP_ENDIAN which uses HSRR[01] and does not
      clear MSR[RI], which improves recoverability.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8ca9c08d
    • N
    • N
      KVM: PPC: Book3S HV: Handle host system reset in guest mode · 6de6638b
      Nicholas Piggin 提交于
      If the host takes a system reset interrupt while a guest is running,
      the CPU must exit the guest before processing the host exception
      handler.
      
      After this patch, taking a sysrq+x with a CPU running in a guest
      gives a trace like this:
      
         cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0]
             pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
             lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
             sp: c000000fdf577850
            msr: 9000000002803033
           current = 0xc000000fdf4b1e00
           paca    = 0xc00000000fd4d680	 softe: 3	 irq_happened: 0x01
             pid   = 6608, comm = qemu-system-ppc
         Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP
         [c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv]
         [c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm]
         [c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm]
         [c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm]
         [c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0
         [c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100
         [c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c
         --- Exception: c01 (System Call) at 00007fff76868df0
         SP (7fff7069baf0) is in userspace
      
      Fixes: e36d0a2e ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET")
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6de6638b
  3. 01 11月, 2017 1 次提交
    • N
      Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols" · 63be1a81
      Naveen N. Rao 提交于
      This reverts commit 83e840c7 ("powerpc64/elfv1: Only dereference
      function descriptor for non-text symbols").
      
      Chandan reported that on newer kernels, trying to enable function_graph
      tracer on ppc64 (BE) locks up the system with the following trace:
      
        Unable to handle kernel paging request for data at address 0x600000002fa30010
        Faulting instruction address: 0xc0000000001f1300
        Thread overran stack, or stack corrupted
        Oops: Kernel access of bad area, sig: 11 [#1]
        BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
        Modules linked in:
        CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty #20
        task: c000000625c07200 task.stack: c000000625c07310
        NIP:  c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8
        REGS: c000000625c088c0 TRAP: 0380   Not tainted  (4.14.0-rc3-00162-g6e51f1f-dirty)
        MSR:  8000000000001032 <SF,ME,IR,DR,RI>  CR: 28002848  XER: 00000000
        CFAR: c0000000001f1320 SOFTE: 0
        ...
        NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90
        LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0
        Call Trace:
        [c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable)
        [c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0
        [c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130
        [c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
        [c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0
        [c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
        ...
        [c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130
        [c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
        [c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0
        [c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
        [c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
        [c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0
      
      This is because ftrace is using ppc_function_entry() for obtaining the
      address of return_to_handler() in prepare_ftrace_return(). The call to
      kernel_text_address() itself gets traced and we end up in a recursive
      loop.
      
      Fixes: 83e840c7 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols")
      Cc: stable@vger.kernel.org # v4.13+
      Reported-by: NChandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      63be1a81
  4. 22 10月, 2017 1 次提交
  5. 21 10月, 2017 1 次提交
    • M
      powerpc/powernv: Enable TM without suspend if possible · 54820530
      Michael Ellerman 提交于
      Some Power9 revisions can run in a mode where TM operates without
      suspended state. If we find ourself on a CPU that might be in this
      mode, we query OPAL to check, and if so we reenable TM in CPU
      features, and enable a new user feature to signal to userspace that we
      are in this mode.
      
      We do not enable the "normal" user feature, PPC_FEATURE2_HTM, but we
      do enable PPC_FEATURE2_HTM_NOSC because that indicates to userspace
      that the kernel will abort transactions on syscall entry, which is
      true regardless of the suspend mode.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      54820530
  6. 20 10月, 2017 1 次提交
    • M
      powerpc: Add PPC_FEATURE2_HTM_NO_SUSPEND · cba6ac48
      Michael Ellerman 提交于
      Some CPUs can operate in a mode where TM (Transactional Memory) is
      enabled but the suspended state of TM is disabled. In this mode
      tsuspend does not enter suspended state, instead the transaction is
      aborted. Similarly any other event that would lead to suspended state
      instead aborts the transaction.
      
      There is also an ABI change, in that in this mode processes are not
      allowed to sigreturn with an MSR that would lead to suspended state,
      Linux will instead return an error to the sigreturn syscall.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      cba6ac48
  7. 19 10月, 2017 1 次提交
  8. 16 10月, 2017 3 次提交
  9. 06 10月, 2017 1 次提交
  10. 04 10月, 2017 2 次提交
    • N
      powerpc/kprobes: Clean up jprobe detection in livepatch handler · bf3a9125
      Naveen N. Rao 提交于
      In commit c05b8c44 ("powerpc/kprobes: Skip livepatch_handler() for
      jprobes"), we added a helper is_current_kprobe_addr() to help detect if
      the modified regs->nip was due to a jprobe or livepatch. Masami felt
      that the function name was not quite clear. To that end, this patch
      renames is_current_kprobe_addr() to __is_active_jprobe() and adds a
      comment to (hopefully) better clarify the purpose of this helper. The
      helper has also now been moved to kprobes-ftrace.c so that it is only
      available for KPROBES_ON_FTRACE.
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bf3a9125
    • N
      powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET · e36d0a2e
      Nicholas Piggin 提交于
      This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal)
      system similarly to the hcall NMI IPI on pseries guests, when the
      platform/firmware supports it.
      
      This is an example of CPU10 spinning with interrupts hard disabled:
      
        Watchdog CPU:32 detected Hard LOCKUP other CPUS:10
        Watchdog CPU:10 Hard LOCKUP
        CPU: 10 PID: 4410 Comm: bash Not tainted 4.13.0-rc7-00074-ge89ce1f8-dirty #34
        task: c0000003a82b4400 task.stack: c0000003af55c000
        NIP: c0000000000a7b38 LR: c000000000659044 CTR: c0000000000a7b00
        REGS: c00000000fd23d80 TRAP: 0100   Not tainted  (4.13.0-rc7-00074-ge89ce1f8-dirty)
        MSR: 90000000000c1033 <SF,HV,ME,IR,DR,RI,LE>
        CR: 28422222  XER: 20000000
        CFAR: c0000000000a7b38 SOFTE: 0
        GPR00: c000000000659044 c0000003af55fbb0 c000000001072a00 0000000000000078
        GPR04: c0000003c81b5c80 c0000003c81cc7e8 9000000000009033 0000000000000000
        GPR08: 0000000000000000 c0000000000a7b00 0000000000000001 9000000000001003
        GPR12: c0000000000a7b00 c00000000fd83200 0000000010180df8 0000000010189e60
        GPR16: 0000000010189ed8 0000000010151270 000000001018bd88 000000001018de78
        GPR20: 00000000370a0668 0000000000000001 00000000101645e0 0000000010163c10
        GPR24: 00007fffd14d6294 00007fffd14d6290 c000000000fba6f0 0000000000000004
        GPR28: c000000000f351d8 0000000000000078 c000000000f4095c 0000000000000000
        NIP [c0000000000a7b38] sysrq_handle_xmon+0x38/0x40
        LR [c000000000659044] __handle_sysrq+0xe4/0x270
        Call Trace:
        [c0000003af55fbd0] [c000000000659044] __handle_sysrq+0xe4/0x270
        [c0000003af55fc70] [c000000000659810] write_sysrq_trigger+0x70/0xa0
        [c0000003af55fca0] [c0000000003da650] proc_reg_write+0xb0/0x110
        [c0000003af55fcf0] [c0000000003423bc] __vfs_write+0x6c/0x1b0
        [c0000003af55fd90] [c000000000344398] vfs_write+0xd8/0x240
        [c0000003af55fde0] [c00000000034632c] SyS_write+0x6c/0x110
        [c0000003af55fe30] [c00000000000b220] system_call+0x58/0x6c
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Use kernel types for opal_signal_system_reset()]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      e36d0a2e
  11. 28 9月, 2017 2 次提交
    • F
      cxl: Enable global TLBIs for cxl contexts · 03b8abed
      Frederic Barrat 提交于
      The PSL and nMMU need to see all TLB invalidations for the memory
      contexts used on the adapter. For the hash memory model, it is done by
      making all TLBIs global as soon as the cxl driver is in use. For
      radix, we need something similar, but we can refine and only convert
      to global the invalidations for contexts actually used by the device.
      
      The new mm_context_add_copro() API increments the 'active_cpus' count
      for the contexts attached to the cxl adapter. As soon as there's more
      than 1 active cpu, the TLBIs for the context become global. Active cpu
      count must be decremented when detaching to restore locality if
      possible and to avoid overflowing the counter.
      
      The hash memory model support is somewhat limited, as we can't
      decrement the active cpus count when mm_context_remove_copro() is
      called, because we can't flush the TLB for a mm on hash. So TLBIs
      remain global on hash.
      Signed-off-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Fixes: f24be42a ("cxl: Add psl9 specific code")
      Tested-by: NAlistair Popple <alistair@popple.id.au>
      [mpe: Fold in updated comment on the barrier from Fred]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      03b8abed
    • F
      powerpc/mm: Export flush_all_mm() · 6110236b
      Frederic Barrat 提交于
      With the optimizations introduced by commit a46cc7a9
      ("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no
      longer flushes the page walk cache (PWC) with radix. This patch
      introduces flush_all_mm(), which flushes everything, TLB and PWC, for
      a given mm.
      Signed-off-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-By: NAlistair Popple <alistair@popple.id.au>
      [mpe: Add a WARN_ON_ONCE() in the empty hash routines]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6110236b
  12. 27 9月, 2017 1 次提交
    • M
      powerpc/64s: Add workaround for P9 vector CI load issue · 5080332c
      Michael Neuling 提交于
      POWER9 DD2.1 and earlier has an issue where some cache inhibited
      vector load will return bad data. The workaround is two part, one
      firmware/microcode part triggers HMI interrupts when hitting such
      loads, the other part is this patch which then emulates the
      instructions in Linux.
      
      The affected instructions are limited to lxvd2x, lxvw4x, lxvb16x and
      lxvh8x.
      
      When an instruction triggers the HMI, all threads in the core will be
      sent to the HMI handler, not just the one running the vector load.
      
      In general, these spurious HMIs are detected by the emulation code and
      we just return back to the running process. Unfortunately, if a
      spurious interrupt occurs on a vector load that's to normal memory we
      have no way to detect that it's spurious (unless we walk the page
      tables, which is very expensive). In this case we emulate the load but
      we need do so using a vector load itself to ensure 128bit atomicity is
      preserved.
      
      Some additional debugfs emulated instruction counters are added also.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Switch CONFIG_PPC_BOOK3S_64 to CONFIG_VSX to unbreak the build]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      5080332c
  13. 26 9月, 2017 1 次提交
    • B
      powerpc/powernv: Rework EEH initialization on powernv · b9fde58d
      Benjamin Herrenschmidt 提交于
      Remove the post_init callback which is only used
      by powernv, we can just call it explicitly from
      the powernv code.
      
      This partially kills the ability to "disable" eeh at
      runtime via debugfs as this was calling that same
      callback again, but this is both unused and broken
      in several ways. If we want to revive it, we need
      to create a dedicated enable/disable callback on the
      backend that does the right thing.
      
      Let the bulk of eeh initialize normally at
      core_initcall() like it does on pseries by removing
      the hack in eeh_init() that delays it.
      
      Instead we make sure our eeh->probe cleanly bails
      out of the PEs haven't been created yet and we force
      a re-probe where we used to call eeh_init() again.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NRussell Currey <ruscur@russell.cc>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b9fde58d
  14. 09 9月, 2017 1 次提交
    • M
      vga: optimise console scrolling · ac036f95
      Matthew Wilcox 提交于
      Where possible, call memset16(), memmove() or memcpy() instead of using
      open-coded loops.  I don't like the calling convention that uses a byte
      count instead of a count of u16s, but it's a little late to change that.
      Reduces code size of fbcon.o by almost 400 bytes on my laptop build.
      
      [akpm@linux-foundation.org: fix build]
      Link: http://lkml.kernel.org/r/20170720184539.31609-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac036f95
  15. 07 9月, 2017 1 次提交
  16. 02 9月, 2017 4 次提交
    • C
      powerpc/xive: add XIVE Exploitation Mode to CAS · ac5e5a54
      Cédric Le Goater 提交于
      On POWER9, the Client Architecture Support (CAS) negotiation process
      determines whether the guest operates in XIVE Legacy compatibility or
      in XIVE exploitation mode. Now that we have initial guest support for
      the XIVE interrupt controller, let's inform the hypervisor what we can
      do.
      
      The platform advertises the XIVE Exploitation Mode support using the
      property "ibm,arch-vec-5-platform-support-vec-5", byte 23 bits 0-1 :
      
       - 0b00 XIVE legacy mode Only
       - 0b01 XIVE exploitation mode Only
       - 0b10 XIVE legacy or exploitation mode
      
      The OS asks for XIVE Exploitation Mode support using the property
      "ibm,architecture-vec-5", byte 23 bits 0-1:
      
       - 0b00 XIVE legacy mode Only
       - 0b01 XIVE exploitation mode Only
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ac5e5a54
    • C
      powerpc/xive: introduce H_INT_ESB hcall · bed81ee1
      Cédric Le Goater 提交于
      The H_INT_ESB hcall() is used to issue a load or store to the ESB page
      instead of using the MMIO pages. This can be used as a workaround on
      some HW issues. The OS knows that this hcall should be used on an
      interrupt source when the ESB hcall flag is set to 1 in the hcall
      H_INT_GET_SOURCE_INFO.
      
      To maintain the frontier between the xive frontend and backend, we
      introduce a new xive operation 'esb_rw' to be used in the routines
      doing memory accesses on the ESBs.
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bed81ee1
    • C
      powerpc/xive: add the HW IRQ number under xive_irq_data · c58a14a9
      Cédric Le Goater 提交于
      It will be required later by the H_INT_ESB hcall.
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c58a14a9
    • C
      powerpc/xive: guest exploitation of the XIVE interrupt controller · eac1e731
      Cédric Le Goater 提交于
      This is the framework for using XIVE in a PowerVM guest. The support
      is very similar to the native one in a much simpler form.
      
      Each source is associated with an Event State Buffer (ESB). This is a
      two bit state machine which is used to trigger events. The bits are
      named "P" (pending) and "Q" (queued) and can be controlled by MMIO.
      The Guest OS registers event (or notifications) queues on which the HW
      will post event data for a target to notify.
      
      Instead of OPAL calls, a set of Hypervisors call are used to configure
      the interrupt sources and the event/notification queues of the guest:
      
       - H_INT_GET_SOURCE_INFO
      
         used to obtain the address of the MMIO page of the Event State
         Buffer (PQ bits) entry associated with the source.
      
       - H_INT_SET_SOURCE_CONFIG
      
         assigns a source to a "target".
      
       - H_INT_GET_SOURCE_CONFIG
      
         determines to which "target" and "priority" is assigned to a source
      
       - H_INT_GET_QUEUE_INFO
      
         returns the address of the notification management page associated
         with the specified "target" and "priority".
      
       - H_INT_SET_QUEUE_CONFIG
      
         sets or resets the event queue for a given "target" and "priority".
         It is also used to set the notification config associated with the
         queue, only unconditional notification for the moment.  Reset is
         performed with a queue size of 0 and queueing is disabled in that
         case.
      
       - H_INT_GET_QUEUE_CONFIG
      
         returns the queue settings for a given "target" and "priority".
      
       - H_INT_RESET
      
         resets all of the partition's interrupt exploitation structures to
         their initial state, losing all configuration set via the hcalls
         H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
      
       - H_INT_SYNC
      
         issue a synchronisation on a source to make sure sure all
         notifications have reached their queue.
      
      As for XICS, the XIVE interface for the guest is described in the
      device tree under the "interrupt-controller" node. A couple of new
      properties are specific to XIVE :
      
       - "reg"
      
         contains the base address and size of the thread interrupt
         managnement areas (TIMA), also called rings, for the User level and
         for the Guest OS level. Only the Guest OS level is taken into
         account today.
      
       - "ibm,xive-eq-sizes"
      
         the size of the event queues. One cell per size supported, contains
         log2 of size, in ascending order.
      
       - "ibm,xive-lisn-ranges"
      
         the interrupt numbers ranges assigned to the guest. These are
         allocated using a simple bitmap.
      
      and also :
      
       - "/ibm,plat-res-int-priorities"
      
         contains a list of priorities that the hypervisor has reserved for
         its own use.
      
      Tested with a QEMU XIVE model for pseries and with the Power hypervisor.
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      eac1e731
  17. 01 9月, 2017 6 次提交
    • H
      crypto/nx: Add P9 NX specific error codes for 842 engine · 146e9f1b
      Haren Myneni 提交于
      This patch adds changes for checking P9 specific 842 engine
      error codes. These errros are reported in coprocessor status
      block (CSB) for failures.
      Signed-off-by: NHaren Myneni <haren@us.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      146e9f1b
    • C
      powerpc/32: add memset16() · da74f659
      Christophe Leroy 提交于
      Commit 694fc88c ("powerpc/string: Implement optimized
      memset variants") added memset16(), memset32() and memset64()
      for the 64 bits PPC.
      
      On 32 bits, memset64() is not relevant, and as shown below,
      the generic version of memset32() gives a good code, so only
      memset16() is candidate for an optimised version.
      
      000009c0 <memset32>:
       9c0:   2c 05 00 00     cmpwi   r5,0
       9c4:   39 23 ff fc     addi    r9,r3,-4
       9c8:   4d 82 00 20     beqlr
       9cc:   7c a9 03 a6     mtctr   r5
       9d0:   94 89 00 04     stwu    r4,4(r9)
       9d4:   42 00 ff fc     bdnz    9d0 <memset32+0x10>
       9d8:   4e 80 00 20     blr
      
      The last part of memset() handling the not 4-bytes multiples
      operates on bytes, making it unsuitable for handling word without
      modification. As it would increase memset() complexity, it is
      better to implement memset16() from scratch. In addition it
      has the advantage of allowing a more optimised memset16() than what
      we would have by using the memset() function.
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      da74f659
    • P
      powerpc: Emulate load/store floating point as integer word instructions · d2b65ac6
      Paul Mackerras 提交于
      This adds emulation for the lfiwax, lfiwzx and stfiwx instructions.
      This necessitated adding a new flag to indicate whether a floating
      point or an integer conversion was needed for LOAD_FP and STORE_FP,
      so this moves the size field in op->type up 4 bits.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d2b65ac6
    • P
      powerpc: Separate out load/store emulation into its own function · a53d5182
      Paul Mackerras 提交于
      This moves the parts of emulate_step() that deal with emulating
      load and store instructions into a new function called
      emulate_loadstore().  This is to make it possible to reuse this
      code in the alignment handler.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a53d5182
    • P
      powerpc: Handle opposite-endian processes in emulation code · d955189a
      Paul Mackerras 提交于
      This adds code to the load and store emulation code to byte-swap
      the data appropriately when the process being emulated is set to
      the opposite endianness to that of the kernel.
      
      This also enables the emulation for the multiple-register loads
      and stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for
      little-endian.  In little-endian mode, the partial word at the
      end of a transfer for lsw*/stsw* (when the byte count is not a
      multiple of 4) is loaded/stored at the least-significant end of
      the register.  Additionally, this fixes a bug in the previous
      code in that it could call read_mem/write_mem with a byte count
      that was not 1, 2, 4 or 8.
      
      Note that this only works correctly on processors with "true"
      little-endian mode, such as IBM POWER processors from POWER6 on, not
      the so-called "PowerPC" little-endian mode that uses address swizzling
      as implemented on the old 32-bit 603, 604, 740/750, 74xx CPUs.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d955189a
    • P
      powerpc: Emulate the dcbz instruction · b2543f7b
      Paul Mackerras 提交于
      This adds code to analyse_instr() and emulate_step() to understand the
      dcbz (data cache block zero) instruction.  The emulate_dcbz() function
      is made public so it can be used by the alignment handler in future.
      (The apparently unnecessary cropping of the address to 32 bits is
      there because it will be needed in that situation.)
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b2543f7b