1. 01 9月, 2017 1 次提交
    • P
      powerpc: Use instruction emulation infrastructure to handle alignment faults · 31bfdb03
      Paul Mackerras 提交于
      This replaces almost all of the instruction emulation code in
      fix_alignment() with calls to analyse_instr(), emulate_loadstore()
      and emulate_dcbz().  The only emulation code left is the SPE
      emulation code; analyse_instr() etc. do not handle SPE instructions
      at present.
      
      One result of this is that we can now handle alignment faults on
      all the new VSX load and store instructions that were added in POWER9.
      VSX loads/stores will take alignment faults for unaligned accesses
      to cache-inhibited memory.
      
      Another effect is that we no longer rely on the DAR and DSISR values
      set by the processor.
      
      With this, we now need to include the instruction emulation code
      unconditionally.
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      31bfdb03
  2. 19 8月, 2017 1 次提交
  3. 15 8月, 2017 3 次提交
  4. 14 8月, 2017 1 次提交
  5. 10 8月, 2017 3 次提交
  6. 13 7月, 2017 3 次提交
  7. 04 7月, 2017 1 次提交
    • B
      powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs · 1e0fc9d1
      Balbir Singh 提交于
      All code that patches kernel text has been moved over to using
      patch_instruction() and patch_instruction() is able to cope with the
      kernel text being read only.
      
      The linker script has been updated to ensure the read only data ends
      on a large page boundary, so it and the preceding kernel text can be
      marked R_X. We also have implementations of mark_rodata_ro() for Hash
      and Radix MMU modes.
      
      There are some corner-cases missing when the kernel is built
      relocatable, so for now make it depend on !RELOCATABLE.
      
      There's also a temporary workaround to depend on !HIBERNATION to avoid
      a build failure, that will be removed once we've merged with the PM
      tree.
      Signed-off-by: NBalbir Singh <bsingharora@gmail.com>
      [mpe: Make it depend on !RELOCATABLE, munge change log]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1e0fc9d1
  8. 02 7月, 2017 1 次提交
  9. 30 6月, 2017 1 次提交
  10. 22 6月, 2017 1 次提交
    • P
      powerpc: Convert VDSO update function to use new update_vsyscall interface · d4cfb113
      Paul Mackerras 提交于
      This converts the powerpc VDSO time update function to use the new
      interface introduced in commit 576094b7 ("time: Introduce new
      GENERIC_TIME_VSYSCALL", 2012-09-11).  Where the old interface gave
      us the time as of the last update in seconds and whole nanoseconds,
      with the new interface we get the nanoseconds part effectively in
      a binary fixed-point format with tk->tkr_mono.shift bits to the
      right of the binary point.
      
      With the old interface, the fractional nanoseconds got truncated,
      meaning that the value returned by the VDSO clock_gettime function
      would have about 1ns of jitter in it compared to the value computed
      by the generic timekeeping code in the kernel.
      
      The powerpc VDSO time functions (clock_gettime and gettimeofday)
      already work in units of 2^-32 seconds, or 0.23283 ns, because that
      makes it simple to split the result into seconds and fractional
      seconds, and represent the fractional seconds in either microseconds
      or nanoseconds.  This is good enough accuracy for now, so this patch
      avoids changing how the VDSO works or the interface in the VDSO data
      page.
      
      This patch converts the powerpc update_vsyscall_old to be called
      update_vsyscall and use the new interface.  We convert the fractional
      second to units of 2^-32 seconds without truncating to whole nanoseconds.
      (There is still a conversion to whole nanoseconds for any legacy users
      of the vdso_data/systemcfg stamp_xtime field.)
      
      In addition, this improves the accuracy of the computation of tb_to_xs
      for those systems with high-frequency timebase clocks (>= 268.5 MHz)
      by doing the right shift in two parts, one before the multiplication and
      one after, rather than doing the right shift before the multiplication.
      (We can't do all of the right shift after the multiplication unless we
      use 128-bit arithmetic.)
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Acked-by: NJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d4cfb113
  11. 13 6月, 2017 1 次提交
  12. 09 6月, 2017 1 次提交
  13. 08 6月, 2017 1 次提交
    • M
      powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default · c6ee9619
      Michael Ellerman 提交于
      The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with
      other general kernel options. It's really more at home in the "Platform
      Support" section, so move it there.
      
      Also enable it by default, for Book3s 64. It does mostly nothing unless
      the device tree properties are found, and we will want it enabled
      eventually in distro kernels, so turn it on to start getting more
      testing.
      
      Fixes: 5a61ef74 ("powerpc/64s: Support new device tree binding for discovering CPU features")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c6ee9619
  14. 02 6月, 2017 1 次提交
    • C
      powerpc: Remove __ilog2()s and use generic ones · f782ddf2
      Christophe Leroy 提交于
      With the __ilog2() function as defined in
      arch/powerpc/include/asm/bitops.h, GCC will not optimise the code
      in case of constant parameter.
      
      The generic ilog2() function in include/linux/log2.h is written
      to handle the case of the constant parameter.
      
      This patch discards the three __ilog2() functions and
      defines __ilog2() as ilog2()
      
      For non constant calls, the generated code is doing the same:
      int test__ilog2(unsigned long x)
      {
      	return __ilog2(x);
      }
      
      int test__ilog2_u32(u32 n)
      {
      	return __ilog2_u32(n);
      }
      
      int test__ilog2_u64(u64 n)
      {
      	return __ilog2_u64(n);
      }
      
      On PPC32 before the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC32 after the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC64 before the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      
      On PPC64 after the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f782ddf2
  15. 01 6月, 2017 1 次提交
  16. 30 5月, 2017 2 次提交
    • N
      powerpc/64: Handle linker stubs in low .text code · 951eedeb
      Nicholas Piggin 提交于
      Very large kernels may require linker stubs for branches from HEAD
      text code. The linker may place these stubs before the HEAD text
      sections, which breaks the assumption that HEAD text is located at 0
      (or the .text section being located at 0x7000/0x8000 on Book3S
      kernels).
      
      Provide an option to create a small section just before the .text
      section with an empty 256 - 4 bytes, and adjust the start of the .text
      section to match. The linker will tend to put stubs in that section
      and not break our relative-to-absolute offset assumptions.
      
      This causes a small waste of space on common kernels, but allows large
      kernels to build and boot. For now, it is an EXPERT config option,
      defaulting to =n, but a reference is provided for it in the build-time
      check for such breakage. This is good enough for allyesconfig and
      custom users / hackers.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      951eedeb
    • A
      powerpc: Add HAVE_IRQ_TIME_ACCOUNTING · 518470fe
      Anton Blanchard 提交于
      Allow us to enable IRQ_TIME_ACCOUNTING. Even though we currently
      use VIRT_CPU_ACCOUNTING_NATIVE, that option is quite heavy
      weight and IRQ_TIME_ACCOUNTING might be better in some cases.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      518470fe
  17. 09 5月, 2017 2 次提交
  18. 01 5月, 2017 1 次提交
  19. 28 4月, 2017 1 次提交
  20. 27 4月, 2017 2 次提交
  21. 24 4月, 2017 1 次提交
  22. 23 4月, 2017 1 次提交
    • I
      Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" · 6dd29b3d
      Ingo Molnar 提交于
      This reverts commit 2947ba05.
      
      Dan Williams reported dax-pmem kernel warnings with the following signature:
      
         WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
         percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic
      
      ... and bisected it to this commit, which suggests possible memory corruption
      caused by the x86 fast-GUP conversion.
      
      He also pointed out:
      
       "
        This is similar to the backtrace when we were not properly handling
        pud faults and was fixed with this commit: 220ced16 "mm: fix
        get_user_pages() vs device-dax pud mappings"
      
        I've found some missing _devmap checks in the generic
        get_user_pages_fast() path, but this does not fix the regression
        [...]
       "
      
      So given that there are known bugs, and a pretty robust looking bisection
      points to this commit suggesting that are unknown bugs in the conversion
      as well, revert it for the time being - we'll re-try in v4.13.
      Reported-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dann.frazier@canonical.com
      Cc: dave.hansen@intel.com
      Cc: steve.capper@linaro.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6dd29b3d
  23. 21 4月, 2017 1 次提交
    • M
      powerpc/mm: Add support for runtime configuration of ASLR limits · 9fea59bd
      Michael Ellerman 提交于
      Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two
      sysctls that allow a user to configure the number of bits of randomness used for
      ASLR.
      
      Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to
      construct at least the MIN value in Kconfig, vs in a header which would be more
      natural. Given that we just go ahead and do it all in Kconfig.
      
      At least according to the code (the documentation makes no mention of it), the
      value is defined as the number of bits of randomisation *of the page*, not the
      address. This makes some sense, with larger page sizes more of the low bits are
      forced to zero, which would reduce the randomisation if we didn't take the
      PAGE_SIZE into account. However it does mean the min/max values have to change
      depending on the PAGE_SIZE in order to actually limit the amount of address
      space consumed by the randomisation.
      
      The result of that is that we have to define the default values based on both
      32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we
      have 128TB address space support on Book3S, we also have to take that into
      account.
      
      Finally we can wire up the value in arch_mmap_rnd().
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NBhupesh Sharma <bhsharma@redhat.com>
      Tested-by: NBhupesh Sharma <bhsharma@redhat.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      9fea59bd
  24. 19 4月, 2017 1 次提交
  25. 11 4月, 2017 1 次提交
  26. 07 4月, 2017 1 次提交
  27. 20 3月, 2017 1 次提交
  28. 18 3月, 2017 1 次提交
  29. 06 3月, 2017 1 次提交
    • M
      powerpc: Sort the selects under CONFIG_PPC · a7d2475a
      Michael Ellerman 提交于
      We have a big list of selects under CONFIG_PPC, and currently they're
      completely unsorted. This means people tend to add new selects at the
      bottom of the list, and so two commits which both add a new select will
      often conflict.
      
      Instead sort it alphabetically. This is nicer in and of itself, but also
      means two commits that add a new select will have a greater chance of
      not conflicting.
      
      Add a note at the top and bottom asking people to keep it sorted.
      
      And while we're here pad out the 'if' expressions to make them stand
      out.
      Suggested-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a7d2475a
  30. 10 2月, 2017 2 次提交
    • A
      powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL · 496e9cb5
      Anton Blanchard 提交于
      The final paragraph of the help text is reversed. We want to enable
      this option by default, and disable it if the toolchain has a working
      -mprofile-kernel.
      
      Fixes: 8c50b72a ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel")
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      496e9cb5
    • A
      powerpc/kprobes: Implement Optprobes · 51c9c084
      Anju T 提交于
      Current infrastructure of kprobe uses the unconditional trap instruction
      to probe a running kernel. Optprobe allows kprobe to replace the trap
      with a branch instruction to a detour buffer. Detour buffer contains
      instructions to create an in memory pt_regs. Detour buffer also has a
      call to optimized_callback() which in turn call the pre_handler(). After
      the execution of the pre-handler, a call is made for instruction
      emulation. The NIP is determined in advanced through dummy instruction
      emulation and a branch instruction is created to the NIP at the end of
      the trampoline.
      
      To address the limitation of branch instruction in POWER architecture,
      detour buffer slot is allocated from a reserved area. For the time
      being, 64KB is reserved in memory for this purpose.
      
      Instructions which can be emulated using analyse_instr() are the
      candidates for optimization. Before optimization ensure that the address
      range between the detour buffer allocated and the instruction being
      probed is within +/- 32MB.
      Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      51c9c084