1. 30 July 2018 (6 commits)
  2. 24 July 2018 (10 commits)
    • powerpc/powernv: implement opal_put_chars_atomic · 17cc1dd4
      Committed by Nicholas Piggin
      The RAW console does not need writes to be atomic, so relax
      opal_put_chars to be able to do partial writes, and implement an
      _atomic variant which does not take a spinlock. This API is used
      in xmon, so the less locking that is used, the better chance there
      is that a crash can be debugged.
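      
      A minimal sketch of the resulting split, assuming a shared internal helper
      (the __opal_put_chars name is illustrative; opal_console_write and
      opal_write_lock are the existing OPAL call and lock):
      
      static int __opal_put_chars(uint32_t vtermno, const char *data,
                                  int total_len, bool atomic)
      {
              unsigned long flags = 0;
              __be64 olen = cpu_to_be64(total_len);
              s64 rc;
      
              if (!atomic)
                      spin_lock_irqsave(&opal_write_lock, flags);
              rc = opal_console_write(vtermno, &olen, data);
              if (!atomic)
                      spin_unlock_irqrestore(&opal_write_lock, flags);
      
              /* Partial writes are fine for the RAW console; callers retry. */
              return rc == OPAL_SUCCESS ? be64_to_cpu(olen) : 0;
      }
      
      int opal_put_chars(uint32_t vtermno, const char *data, int total_len)
      {
              return __opal_put_chars(vtermno, data, total_len, false);
      }
      
      int opal_put_chars_atomic(uint32_t vtermno, const char *data, int total_len)
      {
              return __opal_put_chars(vtermno, data, total_len, true);
      }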
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      17cc1dd4
    • powerpc/powernv: Implement and use opal_flush_console · d2a2262e
      Committed by Nicholas Piggin
      A new console flushing firmware API was introduced to replace event
      polling loops, and implemented in opal-kmsg with affddff6
      ("powerpc/powernv: Add a kmsg_dumper that flushes console output on
      panic"), to flush the console in the panic path.
      
      The OPAL console driver has other situations where interrupts are off
      and it needs to flush the console synchronously. These still use a
      polling loop.
      
      So move the opal-kmsg flush code to opal_flush_console, and use the
      new function in opal-kmsg and opal_put_chars.
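      
      Roughly what the synchronous flush helper amounts to (a simplified sketch;
      handling of OPAL_UNSUPPORTED on older firmware and the exact return-code
      mapping are omitted):
      
      int opal_flush_console(uint32_t vtermno)
      {
              s64 rc;
      
              do {
                      rc = opal_console_flush(vtermno);
                      /* OPAL_PARTIAL: more data to flush, call again */
              } while (rc == OPAL_PARTIAL || rc == OPAL_BUSY);
      
              return rc == OPAL_SUCCESS ? 0 : -EIO;
      }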
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d2a2262e
    • powerpc/64: enhance memcmp() with VMX instruction for long bytes comparison · d58badfb
      Committed by Simon Guo
      This patch adds VMX primitives to do memcmp() when the compare size
      is equal to or greater than 4K bytes. The KSM feature can benefit from this.
      
      Test result with the following test program:
      ------
      # cat tools/testing/selftests/powerpc/stringloops/memcmp.c
      #include <malloc.h>
      #include <stdlib.h>
      #include <string.h>
      #include <time.h>
      #include "utils.h"
      #define SIZE (1024 * 1024 * 900)
      #define ITERATIONS 40
      
      int test_memcmp(const void *s1, const void *s2, size_t n);
      
      static int testcase(void)
      {
              char *s1;
              char *s2;
              unsigned long i;
      
              s1 = memalign(128, SIZE);
              if (!s1) {
                      perror("memalign");
                      exit(1);
              }
      
              s2 = memalign(128, SIZE);
              if (!s2) {
                      perror("memalign");
                      exit(1);
              }
      
              for (i = 0; i < SIZE; i++)  {
                      s1[i] = i & 0xff;
                      s2[i] = i & 0xff;
              }
              for (i = 0; i < ITERATIONS; i++) {
                      int ret = test_memcmp(s1, s2, SIZE);

                      if (ret) {
                              printf("return %d at[%ld]! should have returned zero\n", ret, i);
                              abort();
                      }
              }
      
              return 0;
      }
      
      int main(void)
      {
              return test_harness(testcase, "memcmp");
      }
      ------
      Without this patch (but with the first patch "powerpc/64: Align bytes
      before fall back to .Lshort in powerpc64 memcmp()." in the series):
      	4.726728762 seconds time elapsed                                          ( +-  3.54%)
      With VMX patch:
      	4.234335473 seconds time elapsed                                          ( +-  2.63%)
      This is a ~10% improvement.
      
      Testing with an unaligned, different-offset version (shifting s1 and s2 by a
      random offset within 16 bytes) can achieve a higher improvement than 10%.
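      
      The dispatch idea in C terms (a sketch only; the real code is powerpc
      assembly, and memcmp_scalar/memcmp_vmx_loop/enter_vmx_ops/exit_vmx_ops are
      hypothetical helper names here):
      
      #define VMX_THRESHOLD   4096    /* vector setup only pays off for >= 4K */
      
      int memcmp_vmx_dispatch(const void *s1, const void *s2, size_t n)
      {
              int ret;
      
              /* Short compares, or contexts where the vector unit must not be
               * touched, keep using the scalar path. */
              if (n < VMX_THRESHOLD || in_interrupt())
                      return memcmp_scalar(s1, s2, n);
      
              /* Save user VMX state, run the 16-bytes-at-a-time
               * vcmpequd/vcmpequb loop, then restore the state. */
              enter_vmx_ops();
              ret = memcmp_vmx_loop(s1, s2, n);
              exit_vmx_ops();
      
              return ret;
      }
      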
      Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d58badfb
    • powerpc: add vcmpequd/vcmpequb ppc instruction macro · f1ecbaf4
      Committed by Simon Guo
      Some old toolchains don't know about instructions like vcmpequd.
      
      This patch adds .long macros for vcmpequd and vcmpequb, in preparation
      for optimizing ppc64 memcmp() with VMX instructions.
      Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f1ecbaf4
    • a833280b
    • powerpc/mm: Increase MAX_PHYSMEM_BITS to 128TB with SPARSEMEM_VMEMMAP config · 7d4340bb
      Committed by Aneesh Kumar K.V
      We do this only with the SPARSEMEM_VMEMMAP config, so that our
      page_to_nid()/page_to_section() etc. are not impacted.
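      
      The change essentially amounts to something like this (a sketch; 2^47 bytes
      = 128TB, and the previous 46-bit/64TB value and exact header placement are
      assumptions):
      
      #ifdef CONFIG_SPARSEMEM_VMEMMAP
      #define MAX_PHYSMEM_BITS        47      /* 2^47 = 128TB */
      #else
      #define MAX_PHYSMEM_BITS        46      /* 2^46 = 64TB, as before */
      #endif
      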
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7d4340bb
    • powerpc: NMI IPI make NMI IPIs fully synchronous · 5b73151f
      Committed by Nicholas Piggin
      There is an asynchronous aspect to smp_send_nmi_ipi. The caller waits
      for all CPUs to call in to the handler, but it does not wait for
      completion of the handler. This is a needless complication, so remove
      it and always wait synchronously.
      
      The synchronous wait allows the caller to easily time out and clear
      the wait for completion (zero nmi_ipi_busy_count) in the case of badly
      behaved handlers. This would have prevented the recent smp_send_stop
      NMI IPI bug from causing the system to hang.
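      
      A sketch of the synchronous wait with a timeout (simplified; the real code
      also handles the NMI IPI lock, and the helper name is hypothetical):
      
      /* Wait for all target CPUs to finish the handler, with a timeout so a
       * badly behaved handler cannot hang the caller (delay in microseconds). */
      static void nmi_ipi_wait_done(u64 delay_us)
      {
              while (nmi_ipi_busy_count && delay_us) {
                      udelay(1);
                      delay_us--;
              }
              if (nmi_ipi_busy_count)
                      nmi_ipi_busy_count = 0;         /* timed out: clear the wait */
      }
      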
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5b73151f
    • powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely · 9b81c021
      Committed by Nicholas Piggin
      When the masked interrupt handler clears MSR[EE] for an interrupt in
      the PACA_IRQ_MUST_HARD_MASK set, it does not set PACA_IRQ_HARD_DIS.
      This makes them get out of sync.
      
      With that taken into account, it's only low-level irq manipulation
      (and interrupt entry before reconcile) where they can be out of sync.
      This makes the code less surprising.
      
      It also allows the IRQ replay code to rely on the IRQ_HARD_DIS value
      and not have to mtmsrd again in this case (e.g., for an external
      interrupt that has been masked). The bigger benefit might just be
      that there is not such an element of surprise in these two bits of
      state.
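      
      In other words, whenever MSR[EE] is cleared for a must-hard-mask interrupt,
      the flag is set alongside it, roughly (a sketch; the helper name is
      hypothetical, __hard_irq_disable() and PACA_IRQ_HARD_DIS are existing):
      
      static inline void hard_disable_and_record(void)
      {
              __hard_irq_disable();                           /* clears MSR[EE] */
              local_paca->irq_happened |= PACA_IRQ_HARD_DIS;  /* keep state in sync */
      }
      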
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      9b81c021
    • powerpc/pkeys: make protection key 0 less special · 07f522d2
      Committed by Ram Pai
      Applications need the ability to associate an address range with some
      key and later revert to its initial default key. Pkey-0 comes close to
      providing this function but falls short, because the current
      implementation does not allow applications to explicitly associate
      pkey-0 with the address range.
      
      Let's make pkey-0 less special and treat it almost like any other key.
      Thus it can be explicitly associated with any address range, and can be
      freed. This gives the application more flexibility and power.  The
      ability to free pkey-0 must be used responsibly, since pkey-0 is
      associated with almost all address ranges by default.
      
      Even with this change, pkey-0 continues to be slightly more special
      in the following respects:
      (a) it is implicitly allocated.
      (b) it is the default key assigned to any address-range.
      (c) its permissions cannot be modified by userspace.
      
      NOTE: (c) is specific to powerpc only. pkey-0 is associated by default
      with all pages including kernel pages, and pkeys are also active in
      kernel mode. If any permission is denied on pkey-0, the kernel running
      in the context of the application will be unable to operate.
      
      Tested on powerpc.
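      
      From userspace, the point of the change is that something like this now
      works for key 0 as it does for any other key (illustrative snippet using
      the glibc pkey_mprotect() wrapper; the function name is hypothetical):
      
      #define _GNU_SOURCE
      #include <sys/mman.h>
      
      /* Explicitly bind the default key (0) back to a range, e.g. to undo an
       * earlier pkey_mprotect() with a non-zero key. Before this change the
       * kernel rejected key 0 here. */
      int revert_to_default_key(void *addr, size_t len)
      {
              return pkey_mprotect(addr, len, PROT_READ | PROT_WRITE, 0);
      }
      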
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      [mpe: Drop #define PKEY_0 0 in favour of plain old 0]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      07f522d2
    • powerpc/pkeys: key allocation/deallocation must not change pkey registers · 4a4a5e5d
      Committed by Ram Pai
      Key allocation and deallocation have the side effect of programming the
      UAMOR/AMR/IAMR registers. This is wrong, since it is the responsibility
      of the application, and not of the kernel, to modify the permissions on
      the key.
      
      Do not modify the pkey registers at key allocation/deallocation.
      
      This patch also fixes a bug where sys_pkey_free() resets the UAMOR
      bits of the key, thus making its permissions unmodifiable from user
      space. Later, if the same key gets reallocated from a different thread,
      that thread will no longer be able to change the permissions on the key.
      
      Fixes: cf43d3b2 ("powerpc: Enable pkey subsystem")
      Cc: stable@vger.kernel.org # v4.16+
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4a4a5e5d
  3. 16 July 2018 (4 commits)
    • powerpc/powernv/ioda: Allocate indirect TCE levels on demand · a68bd126
      Committed by Alexey Kardashevskiy
      At the moment we allocate the entire TCE table, twice (hardware part and
      userspace translation cache). This normally works, as we typically have
      contiguous memory and the guest will map its entire RAM for 64-bit DMA.
      
      However, if we have sparse RAM (one example is a memory device), then
      we will allocate TCEs which will never be used, as the guest only maps
      actual memory for DMA. If it is a single-level TCE table, there is nothing
      we can really do, but if it is a multilevel table, we can skip allocating
      TCEs we know we won't need.
      
      This adds the ability to allocate only the first level, saving memory.
      
      This changes iommu_table::free() to avoid allocating an extra level;
      iommu_table::set() will do this when needed.
      
      This adds an @alloc parameter to iommu_table::exchange() to tell the callback
      whether it can allocate an extra level; the flag is set to "false" for
      the realmode KVM handlers of H_PUT_TCE hcalls, in which case the callback
      returns H_TOO_HARD.
      
      This still requires the entire table to be counted in mm::locked_vm.
      
      To be conservative, this only does on-demand allocation when
      the userspace cache table is requested, which is the case for VFIO.
      
      The example math for a system replicating a powernv setup with NVLink2
      in a guest:
      16GB RAM mapped at 0x0
      128GB GPU RAM window (16GB of actual RAM) mapped at 0x244000000000
      
      the table to cover all of that with 64K pages takes:
      (((0x244000000000 + 0x2000000000) >> 16)*8)>>20 = 4656MB
      
      If we allocate only the necessary TCE levels, we will only need:
      (((0x400000000 + 0x400000000) >> 16)*8)>>20 = 4MB (plus some for indirect
      levels).
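      
      The on-demand idea, in a simplified two-level form (a sketch only; the
      helper name and the flat level1 array are hypothetical, and the real code
      supports more levels):
      
      #define TCES_PER_LEAF   (PAGE_SIZE / sizeof(__be64))
      
      /* Return a pointer to the TCE entry, allocating the leaf level on first
       * use. With alloc == false (real-mode H_PUT_TCE) never allocate; the NULL
       * return lets the caller fall back with H_TOO_HARD. */
      static __be64 *tce_entry_ptr(__be64 **level1, long idx, bool alloc)
      {
              long l1 = idx / TCES_PER_LEAF;
              long l2 = idx % TCES_PER_LEAF;
      
              if (!level1[l1]) {
                      if (!alloc)
                              return NULL;
                      level1[l1] = kzalloc(PAGE_SIZE, GFP_KERNEL);
                      if (!level1[l1])
                              return NULL;
              }
              return &level1[l1][l2];
      }
      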
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      a68bd126
    • powerpc/powernv: Add indirect levels to it_userspace · 090bad39
      Committed by Alexey Kardashevskiy
      We want to support sparse memory, and therefore huge chunks of DMA windows
      do not need to be mapped. If a DMA window is big enough to require 2 or more
      indirect levels, and the DMA window is used to map all RAM (which is
      the default case for the 64-bit window), we can actually save some memory by
      not allocating TCEs for regions which we are not going to map anyway.
      
      The hardware tables already support indirect levels, but we also keep a
      host-physical-to-userspace translation array which is allocated by
      vmalloc() and is a flat array which might use quite some memory.
      
      This converts it_userspace from a vmalloc'ed array to a multi-level table.
      
      As the format becomes platform dependent, this replaces the direct access
      to it_userspace with an iommu_table_ops::useraddrptr hook which returns
      a pointer to the userspace copy of a TCE; a future extension will return
      NULL if the level was not allocated.
      
      This should not change non-KVM handling of TCE tables and it_userspace
      will not be allocated for non-KVM tables.
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      090bad39
    • KVM: PPC: Make iommu_table::it_userspace big endian · 00a5c58d
      Committed by Alexey Kardashevskiy
      We are going to reuse the multilevel TCE code for the userspace copy of
      the TCE table, and since the hardware table is big endian, let's make the
      copy big endian too.
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      00a5c58d
    • powerpc/64s: Remove POWER9 DD1 support · 2bf1071a
      Committed by Nicholas Piggin
      POWER9 DD1 was never a product. It is no longer supported by upstream
      firmware, and it is not effectively supported in Linux due to lack of
      testing.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
      [mpe: Remove arch_make_huge_pte() entirely]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2bf1071a
  4. 12 July 2018 (1 commit)
  5. 02 July 2018 (2 commits)
  6. 26 June 2018 (1 commit)
  7. 23 June 2018 (1 commit)
  8. 20 June 2018 (1 commit)
  9. 19 June 2018 (1 commit)
  10. 11 June 2018 (1 commit)
  11. 08 June 2018 (1 commit)
    • mm: introduce ARCH_HAS_PTE_SPECIAL · 3010a5ea
      Committed by Laurent Dufour
      Currently, PTE special support is turned on in per-architecture
      header files.  Most of the time, it is defined in
      arch/*/include/asm/pgtable.h, depending (or not) on some other
      per-architecture static definition.
      
      This patch introduces a new configuration variable to manage this
      directly in the Kconfig files.  It would later replace
      __HAVE_ARCH_PTE_SPECIAL.
      
      Here are notes for some architectures where the definition of
      __HAVE_ARCH_PTE_SPECIAL is not obvious:
      
      arm:
      __HAVE_ARCH_PTE_SPECIAL is currently defined in
      arch/arm/include/asm/pgtable-3level.h, which is included by
      arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
      So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.
      
      powerpc:
      __HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
       - arch/powerpc/include/asm/book3s/64/pgtable.h
       - arch/powerpc/include/asm/pte-common.h
      The first one is included if (PPC_BOOK3S & PPC64) while the second is
      included in all the other cases.
      So select ARCH_HAS_PTE_SPECIAL all the time.
      
      sparc:
      __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
      defined(__arch64__), which are defined through the compiler in
      sparc/Makefile if !SPARC32, which I assume means SPARC64.
      So select ARCH_HAS_PTE_SPECIAL if SPARC64.
      
      There is no functional change introduced by this patch.
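      
      Generic code can then key off the Kconfig symbol that each architecture
      selects, instead of a per-arch #define, roughly (a sketch of the intended
      usage; the helper name is hypothetical, not a hunk from the patch):
      
      static inline bool pte_special_supported(void)
      {
      #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
              return true;
      #else
              return false;
      #endif
      }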
      
      Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.com
      Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: Jerome Glisse <jglisse@redhat.com>
      Reviewed-by: Jerome Glisse <jglisse@redhat.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <albert@sifive.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Christophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3010a5ea
  12. 06 June 2018 (2 commits)
    • powerpc: Wire up restartable sequences system call · bb862b02
      Committed by Boqun Feng
      Wire up the rseq system call on powerpc.
      
      This provides an ABI improving the speed of a user-space getcpu
      operation on powerpc by skipping the getcpu system call on the fast
      path, as well as improving the speed of user-space operations on per-cpu
      data compared to using load-reservation/store-conditional atomics.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: linux-api@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20180602124408.8430-11-mathieu.desnoyers@efficios.com
      bb862b02
    • powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap · ff5bc793
      Committed by Nicholas Piggin
      There is a typo in f1cb8f9b ("powerpc/64s/radix: avoid ptesync after
      set_pte and ptep_set_access_flags") config ifdef, which results in the
      necessary ptesync not being issued after vmalloc.
      
      This causes random kernel faults in module load, bpf load, anywhere
      that vmalloc mappings are used.
      
      After correcting the code, this survives a guest kernel booting
      hundreds of times where previously there would be a crash every few
      boots (I haven't noticed the crash on host, perhaps due to different
      TLB and page table walking behaviour in hardware).
      
      A memory clobber is also added to the flush, just to be sure it won't
      be reordered with the pte set or the subsequent mapping access.
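      
      The corrected flush then boils down to an actual ptesync plus a compiler
      barrier, roughly (a sketch; the helper name is hypothetical):
      
      static inline void vmap_ptesync(void)
      {
              /* Order prior PTE stores against the first access through the new
               * mapping; the "memory" clobber keeps the compiler from moving
               * accesses across it. */
              asm volatile("ptesync" : : : "memory");
      }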
      
      Fixes: f1cb8f9b ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ff5bc793
  13. 03 June 2018 (9 commits)
    • powerpc/time: inline arch_vtime_task_switch() · 60f1d289
      Committed by Christophe Leroy
      arch_vtime_task_switch() is a small function which is called
      only from vtime_common_task_switch(), so it is worth inlining.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      60f1d289
    • powerpc: Implement csum_ipv6_magic in assembly · e9c4943a
      Committed by Christophe Leroy
      The generic csum_ipv6_magic() generates a pretty bad result:
      
      00000000 <csum_ipv6_magic>: (PPC32)
         0:	81 23 00 00 	lwz     r9,0(r3)
         4:	81 03 00 04 	lwz     r8,4(r3)
         8:	7c e7 4a 14 	add     r7,r7,r9
         c:	7d 29 38 10 	subfc   r9,r9,r7
        10:	7d 4a 51 10 	subfe   r10,r10,r10
        14:	7d 27 42 14 	add     r9,r7,r8
        18:	7d 2a 48 50 	subf    r9,r10,r9
        1c:	80 e3 00 08 	lwz     r7,8(r3)
        20:	7d 08 48 10 	subfc   r8,r8,r9
        24:	7d 4a 51 10 	subfe   r10,r10,r10
        28:	7d 29 3a 14 	add     r9,r9,r7
        2c:	81 03 00 0c 	lwz     r8,12(r3)
        30:	7d 2a 48 50 	subf    r9,r10,r9
        34:	7c e7 48 10 	subfc   r7,r7,r9
        38:	7d 4a 51 10 	subfe   r10,r10,r10
        3c:	7d 29 42 14 	add     r9,r9,r8
        40:	7d 2a 48 50 	subf    r9,r10,r9
        44:	80 e4 00 00 	lwz     r7,0(r4)
        48:	7d 08 48 10 	subfc   r8,r8,r9
        4c:	7d 4a 51 10 	subfe   r10,r10,r10
        50:	7d 29 3a 14 	add     r9,r9,r7
        54:	7d 2a 48 50 	subf    r9,r10,r9
        58:	81 04 00 04 	lwz     r8,4(r4)
        5c:	7c e7 48 10 	subfc   r7,r7,r9
        60:	7d 4a 51 10 	subfe   r10,r10,r10
        64:	7d 29 42 14 	add     r9,r9,r8
        68:	7d 2a 48 50 	subf    r9,r10,r9
        6c:	80 e4 00 08 	lwz     r7,8(r4)
        70:	7d 08 48 10 	subfc   r8,r8,r9
        74:	7d 4a 51 10 	subfe   r10,r10,r10
        78:	7d 29 3a 14 	add     r9,r9,r7
        7c:	7d 2a 48 50 	subf    r9,r10,r9
        80:	81 04 00 0c 	lwz     r8,12(r4)
        84:	7c e7 48 10 	subfc   r7,r7,r9
        88:	7d 4a 51 10 	subfe   r10,r10,r10
        8c:	7d 29 42 14 	add     r9,r9,r8
        90:	7d 2a 48 50 	subf    r9,r10,r9
        94:	7d 08 48 10 	subfc   r8,r8,r9
        98:	7d 4a 51 10 	subfe   r10,r10,r10
        9c:	7d 29 2a 14 	add     r9,r9,r5
        a0:	7d 2a 48 50 	subf    r9,r10,r9
        a4:	7c a5 48 10 	subfc   r5,r5,r9
        a8:	7c 63 19 10 	subfe   r3,r3,r3
        ac:	7d 29 32 14 	add     r9,r9,r6
        b0:	7d 23 48 50 	subf    r9,r3,r9
        b4:	7c c6 48 10 	subfc   r6,r6,r9
        b8:	7c 63 19 10 	subfe   r3,r3,r3
        bc:	7c 63 48 50 	subf    r3,r3,r9
        c0:	54 6a 80 3e 	rotlwi  r10,r3,16
        c4:	7c 63 52 14 	add     r3,r3,r10
        c8:	7c 63 18 f8 	not     r3,r3
        cc:	54 63 84 3e 	rlwinm  r3,r3,16,16,31
        d0:	4e 80 00 20 	blr
      
      0000000000000000 <.csum_ipv6_magic>: (PPC64)
         0:	81 23 00 00 	lwz     r9,0(r3)
         4:	80 03 00 04 	lwz     r0,4(r3)
         8:	81 63 00 08 	lwz     r11,8(r3)
         c:	7c e7 4a 14 	add     r7,r7,r9
        10:	7f 89 38 40 	cmplw   cr7,r9,r7
        14:	7d 47 02 14 	add     r10,r7,r0
        18:	7d 30 10 26 	mfocrf  r9,1
        1c:	55 29 f7 fe 	rlwinm  r9,r9,30,31,31
        20:	7d 4a 4a 14 	add     r10,r10,r9
        24:	7f 80 50 40 	cmplw   cr7,r0,r10
        28:	7d 2a 5a 14 	add     r9,r10,r11
        2c:	80 03 00 0c 	lwz     r0,12(r3)
        30:	81 44 00 00 	lwz     r10,0(r4)
        34:	7d 10 10 26 	mfocrf  r8,1
        38:	55 08 f7 fe 	rlwinm  r8,r8,30,31,31
        3c:	7d 29 42 14 	add     r9,r9,r8
        40:	81 04 00 04 	lwz     r8,4(r4)
        44:	7f 8b 48 40 	cmplw   cr7,r11,r9
        48:	7d 29 02 14 	add     r9,r9,r0
        4c:	7d 70 10 26 	mfocrf  r11,1
        50:	55 6b f7 fe 	rlwinm  r11,r11,30,31,31
        54:	7d 29 5a 14 	add     r9,r9,r11
        58:	7f 80 48 40 	cmplw   cr7,r0,r9
        5c:	7d 29 52 14 	add     r9,r9,r10
        60:	7c 10 10 26 	mfocrf  r0,1
        64:	54 00 f7 fe 	rlwinm  r0,r0,30,31,31
        68:	7d 69 02 14 	add     r11,r9,r0
        6c:	7f 8a 58 40 	cmplw   cr7,r10,r11
        70:	7c 0b 42 14 	add     r0,r11,r8
        74:	81 44 00 08 	lwz     r10,8(r4)
        78:	7c f0 10 26 	mfocrf  r7,1
        7c:	54 e7 f7 fe 	rlwinm  r7,r7,30,31,31
        80:	7c 00 3a 14 	add     r0,r0,r7
        84:	7f 88 00 40 	cmplw   cr7,r8,r0
        88:	7d 20 52 14 	add     r9,r0,r10
        8c:	80 04 00 0c 	lwz     r0,12(r4)
        90:	7d 70 10 26 	mfocrf  r11,1
        94:	55 6b f7 fe 	rlwinm  r11,r11,30,31,31
        98:	7d 29 5a 14 	add     r9,r9,r11
        9c:	7f 8a 48 40 	cmplw   cr7,r10,r9
        a0:	7d 29 02 14 	add     r9,r9,r0
        a4:	7d 70 10 26 	mfocrf  r11,1
        a8:	55 6b f7 fe 	rlwinm  r11,r11,30,31,31
        ac:	7d 29 5a 14 	add     r9,r9,r11
        b0:	7f 80 48 40 	cmplw   cr7,r0,r9
        b4:	7d 29 2a 14 	add     r9,r9,r5
        b8:	7c 10 10 26 	mfocrf  r0,1
        bc:	54 00 f7 fe 	rlwinm  r0,r0,30,31,31
        c0:	7d 29 02 14 	add     r9,r9,r0
        c4:	7f 85 48 40 	cmplw   cr7,r5,r9
        c8:	7c 09 32 14 	add     r0,r9,r6
        cc:	7d 50 10 26 	mfocrf  r10,1
        d0:	55 4a f7 fe 	rlwinm  r10,r10,30,31,31
        d4:	7c 00 52 14 	add     r0,r0,r10
        d8:	7f 80 30 40 	cmplw   cr7,r0,r6
        dc:	7d 30 10 26 	mfocrf  r9,1
        e0:	55 29 ef fe 	rlwinm  r9,r9,29,31,31
        e4:	7c 09 02 14 	add     r0,r9,r0
        e8:	54 03 80 3e 	rotlwi  r3,r0,16
        ec:	7c 03 02 14 	add     r0,r3,r0
        f0:	7c 03 00 f8 	not     r3,r0
        f4:	78 63 84 22 	rldicl  r3,r3,48,48
        f8:	4e 80 00 20 	blr
      
      This patch implements it in assembly for both PPC32 and PPC64.
      
      Link: https://github.com/linuxppc/linux/issues/9
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e9c4943a
    • powerpc/lib: Adjust .balign inside string functions for PPC32 · 1128bb78
      Committed by Christophe Leroy
      commit 87a156fb ("Align hot loops of some string functions")
      degraded the performance of string functions by adding useless
      nops.
      
      A simple benchmark on an 8xx, calling 100000 times a memchr() that
      matches the first byte, runs in 41668 TB ticks before this patch
      and in 35986 TB ticks after this patch. So this gives an
      improvement of approx 10%.
      
      Another benchmark doing the same with a memchr() matching the 128th
      byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks
      after this patch, so regardless of the number of loops, removing
      those useless nops improves the test by 5683 TB ticks.
      
      Fixes: 87a156fb ("Align hot loops of some string functions")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      1128bb78
    • powerpc/64: optimises from64to32() · 55a0edf0
      Committed by Christophe Leroy
      The current implementation of from64to32() gives a poor result:
      
      0000000000000270 <.from64to32>:
       270:	38 00 ff ff 	li      r0,-1
       274:	78 69 00 22 	rldicl  r9,r3,32,32
       278:	78 00 00 20 	clrldi  r0,r0,32
       27c:	7c 60 00 38 	and     r0,r3,r0
       280:	7c 09 02 14 	add     r0,r9,r0
       284:	78 09 00 22 	rldicl  r9,r0,32,32
       288:	7c 00 4a 14 	add     r0,r0,r9
       28c:	78 03 00 20 	clrldi  r3,r0,32
       290:	4e 80 00 20 	blr
      
      This patch modifies from64to32() to operate in the same
      spirit as csum_fold().
      
      It swaps the two 32-bit halves of sum and then adds the result to the
      unswapped sum. If there is a carry from adding the two 32-bit halves,
      it will carry from the lower half into the upper half, giving us the
      correct sum in the upper half.
      
      The resulting code is:
      
      0000000000000260 <.from64to32>:
       260:	78 60 00 02 	rotldi  r0,r3,32
       264:	7c 60 1a 14 	add     r3,r0,r3
       268:	78 63 00 22 	rldicl  r3,r3,32,32
       26c:	4e 80 00 20 	blr
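      
      In C terms the new sequence is roughly (a sketch matching the assembly
      above):
      
      static inline u32 from64to32(u64 x)
      {
              /* Add the two halves; a carry out of the low 32 bits lands in
               * the high 32 bits, which hold the folded result. */
              return (x + ror64(x, 32)) >> 32;
      }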
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      55a0edf0
    • powerpc/sstep: Introduce GETTYPE macro · e6684d07
      Committed by Ravi Bangoria
      Replace the 'op->type & INSTR_TYPE_MASK' expression with the
      GETTYPE(op->type) macro.
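      
      The macro is just a named form of the masking expression, essentially:
      
      #define GETTYPE(t)      ((t) & INSTR_TYPE_MASK)
      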
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e6684d07
    • powerpc: Use barrier_nospec in copy_from_user() · ddf35cf3
      Committed by Michael Ellerman
      Based on the x86 commit doing the same.
      
      See commit 304ec1b0 ("x86/uaccess: Use __uaccess_begin_nospec()
      and uaccess_try_nospec") and b3bbfb3f ("x86: Introduce
      __uaccess_begin_nospec() and uaccess_try_nospec") for more detail.
      
      In all cases we are ordering the load from the potentially
      user-controlled pointer vs a previous branch based on an access_ok()
      check or similar.
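      
      The pattern being introduced is essentially this (a simplified sketch with
      a hypothetical wrapper name; access_ok(), barrier_nospec() and
      __copy_from_user() are the existing primitives):
      
      static inline unsigned long
      copy_from_user_checked(void *to, const void __user *from, unsigned long n)
      {
              if (!access_ok(VERIFY_READ, from, n))
                      return n;
      
              /* Order the loads below against the access_ok() branch above, so
               * a mispredicted branch cannot speculatively load through a
               * user-controlled pointer. */
              barrier_nospec();
      
              return __copy_from_user(to, from, n);
      }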
      
      Based on a patch from Michal Suchanek.
      Signed-off-by: Michal Suchanek <msuchanek@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ddf35cf3
    • powerpc/64s: Enable barrier_nospec based on firmware settings · cb3d6759
      Committed by Michal Suchanek
      Check what firmware told us and enable/disable the barrier_nospec as
      appropriate.
      
      We err on the side of enabling the barrier, as it's no-op on older
      systems, see the comment for more detail.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      cb3d6759
    • powerpc/64s: Patch barrier_nospec in modules · 815069ca
      Committed by Michal Suchanek
      Note that unlike RFI, which is patched only in the kernel, the nospec
      state reflects the settings at the time the module was loaded.
      
      Iterating all modules and re-patching every time the settings change
      is not implemented.
      
      Based on lwsync patching.
      Signed-off-by: Michal Suchanek <msuchanek@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      815069ca
    • powerpc/64s: Add support for ori barrier_nospec patching · 2eea7f06
      Committed by Michal Suchanek
      Based on the RFI patching. This is required to be able to disable the
      speculation barrier.
      
      Only one barrier type is supported, and it does nothing when the
      firmware does not enable it. Also, re-patching modules is not supported,
      so the only meaningful thing that can be done is patching out the
      speculation barrier at boot when the user says it is not wanted.
      Signed-off-by: Michal Suchanek <msuchanek@suse.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2eea7f06