1. 29 November 2021, 2 commits
  2. 27 October 2021, 2 commits
  3. 22 October 2021, 5 commits
    • powerpc: Activate CONFIG_STRICT_KERNEL_RWX by default · fdacae8a
      Christophe Leroy authored
      CONFIG_STRICT_KERNEL_RWX should be set by default on every
      architecture (see https://github.com/KSPP/linux/issues/4).
      
      On PPC32 we have to find a compromise between performance and
      memory waste when selecting STRICT_KERNEL_RWX, because it implies
      either smaller memory chunks or a larger alignment between RO
      memory and RW memory.
      
      For instance, the 8xx maps memory with 8M pages. So either the
      limit between RO and RW must be 8M aligned, or the mapping falls
      back on 512k pages, which puts more pressure on the TLB.
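      
      As a rough illustration of that trade-off, here is a minimal
      userspace sketch (not part of the patch; the boundary value is
      made up) that counts the 512k pages needed for the partial chunk
      when the RO/RW limit is not 8M aligned:
      
      	#include <stdio.h>
      
      	#define SZ_512K	(512UL * 1024)
      	#define SZ_8M	(8UL * 1024 * 1024)
      
      	int main(void)
      	{
      		/* Hypothetical RO/RW limit: 12.5M, not 8M aligned. */
      		unsigned long boundary = 12 * 1024 * 1024 + SZ_512K;
      
      		if (boundary % SZ_8M)
      			/* The partial 8M chunk costs up to 16 TLB entries. */
      			printf("fallback: %lu x 512k pages\n",
      			       (boundary % SZ_8M) / SZ_512K);
      		else
      			printf("8M aligned: one TLB entry per 8M chunk\n");
      		return 0;
      	}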
      
      book3s/32 maps memory with BATs as much as possible. BATs can have
      any power-of-two size between 128k and 256M, but there are only 4
      to 8 of them, so the alignment must be good enough to allow
      efficient use of the BATs and to avoid falling back on standard
      page mapping, which would kill performance.
      
      So let's go one step further and make it the default, while still
      allowing users to unset it when wanted.
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/057c40164084bfc7d77c0b2ff78d95dbf6a2a21b.1632503622.git.christophe.leroy@csgroup.eu
    • powerpc/32: Add support for out-of-line static calls · 5c810ced
      Christophe Leroy authored
      Add support for out-of-line static calls on PPC32. This change
      improves the performance of calls to global function pointers by
      using direct calls instead of indirect calls.
      
      The trampoline is initially populated with a 'blr' or a branch to
      the target, followed by an unreachable long jump sequence.
      
      To cope with concurrent execution, the trampoline needs to be
      updated in a way that keeps it consistent at all times. This means
      we can't use the traditional lis/addi pair to load r12 with the
      target address, otherwise there would be a window during which the
      first instruction contains the upper part of the new target
      address while the second instruction still contains the lower part
      of the old one. To avoid that, the target address is stored just
      after the 'bctr' and loaded from there with a single instruction.
      
      Then, depending on the target distance, arch_static_call_transform()
      will either replace the first instruction with a direct
      'bl <target>', or with a 'nop' so that the trampoline falls
      through to the long jump sequence.
      
      For the special case of __static_call_return0(), to avoid the risk of
      a far branch, a version of it is inlined at the end of the trampoline.
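      
      A hedged C sketch of that decision (a standalone toy, not the real
      arch_static_call_transform(); patch_rel_branch() and
      patch_nop_and_store() are hypothetical stand-ins for the kernel's
      instruction-patching primitives):
      
      	#include <stdio.h>
      
      	/* Hypothetical helper: would patch a direct 'bl target'. */
      	static void patch_rel_branch(unsigned long at, unsigned long target)
      	{
      		printf("%#lx: bl %#lx\n", at, target);
      	}
      
      	/* Hypothetical helper: would store the target after the 'bctr'
      	 * and patch a 'nop' so the trampoline falls through to the
      	 * long jump sequence. */
      	static void patch_nop_and_store(unsigned long tramp, unsigned long target)
      	{
      		printf("%#lx: nop, target %#lx stored after bctr\n", tramp, target);
      	}
      
      	/* A 'bl' carries a 26-bit signed offset: reach is +/- 32M. */
      	#define BL_RANGE	0x2000000L
      
      	static void transform(unsigned long tramp, unsigned long target)
      	{
      		long delta = (long)(target - tramp);
      
      		if (delta >= -BL_RANGE && delta < BL_RANGE)
      			patch_rel_branch(tramp, target);
      		else
      			patch_nop_and_store(tramp, target);
      	}
      
      	int main(void)
      	{
      		transform(0xc0004a00, 0xc0004ae0);	/* close: direct 'bl' */
      		transform(0xc0004a00, 0xf1000000);	/* far: long jump path */
      		return 0;
      	}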
      
      Performance-wise, the long jump sequence is probably no better
      than the indirect calls emitted by GCC when static calls are not
      used, but such far calls are unlikely to be needed on powerpc32:
      with most configurations the kernel size is far below 32 Mbytes,
      so only modules may happen to be too far away. And even modules
      are likely to be close enough, as they are allocated below the
      kernel core and as close as possible to the kernel text.
      
      The static_call selftest runs successfully with this change.
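      
      For context, here is a minimal usage sketch of the generic static
      call API this patch implements for PPC32 (my_hook, default_hook
      and other_hook are made-up names; DEFINE_STATIC_CALL, static_call()
      and static_call_update() are the real API from
      <linux/static_call.h>):
      
      	#include <linux/static_call.h>
      
      	static int default_hook(int x)
      	{
      		return x + 1;
      	}
      
      	/* Emits a trampoline like __SCT__tp_func_irq_entry below. */
      	DEFINE_STATIC_CALL(my_hook, default_hook);
      
      	static int call_hook(int x)
      	{
      		/* Compiles to a direct 'bl __SCT__my_hook' on PPC32
      		 * instead of a load/mtctr/bctr indirect call. */
      		return static_call(my_hook)(x);
      	}
      
      	static int other_hook(int x)
      	{
      		return x + 2;
      	}
      
      	static void retarget(void)
      	{
      		/* Repatches the trampoline via arch_static_call_transform(). */
      		static_call_update(my_hook, other_hook);
      	}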
      
      With this patch, __do_irq() has the following sequence to trace
      irq entries:
      
      	c0004a00 <__SCT__tp_func_irq_entry>:
      	c0004a00:	48 00 00 e0 	b       c0004ae0 <__traceiter_irq_entry>
      	c0004a04:	3d 80 c0 00 	lis     r12,-16384
      	c0004a08:	81 8c 4a 1c 	lwz     r12,18972(r12)
      	c0004a0c:	7d 89 03 a6 	mtctr   r12
      	c0004a10:	4e 80 04 20 	bctr
      	c0004a14:	38 60 00 00 	li      r3,0
      	c0004a18:	4e 80 00 20 	blr
      	c0004a1c:	00 00 00 00 	.long 0x0
      ...
      	c0005654 <__do_irq>:
      ...
      	c0005664:	7c 7f 1b 78 	mr      r31,r3
      ...
      	c00056a0:	81 22 00 00 	lwz     r9,0(r2)
      	c00056a4:	39 29 00 01 	addi    r9,r9,1
      	c00056a8:	91 22 00 00 	stw     r9,0(r2)
      	c00056ac:	3d 20 c0 af 	lis     r9,-16209
      	c00056b0:	81 29 74 cc 	lwz     r9,29900(r9)
      	c00056b4:	2c 09 00 00 	cmpwi   r9,0
      	c00056b8:	41 82 00 10 	beq     c00056c8 <__do_irq+0x74>
      	c00056bc:	80 69 00 04 	lwz     r3,4(r9)
      	c00056c0:	7f e4 fb 78 	mr      r4,r31
      	c00056c4:	4b ff f3 3d 	bl      c0004a00 <__SCT__tp_func_irq_entry>
      
      Before this patch, __do_irq() was doing the following to trace irq
      entries:
      
      	c0005700 <__do_irq>:
      ...
      	c0005710:	7c 7e 1b 78 	mr      r30,r3
      ...
      	c000574c:	93 e1 00 0c 	stw     r31,12(r1)
      	c0005750:	81 22 00 00 	lwz     r9,0(r2)
      	c0005754:	39 29 00 01 	addi    r9,r9,1
      	c0005758:	91 22 00 00 	stw     r9,0(r2)
      	c000575c:	3d 20 c0 af 	lis     r9,-16209
      	c0005760:	83 e9 f4 cc 	lwz     r31,-2868(r9)
      	c0005764:	2c 1f 00 00 	cmpwi   r31,0
      	c0005768:	41 82 00 24 	beq     c000578c <__do_irq+0x8c>
      	c000576c:	81 3f 00 00 	lwz     r9,0(r31)
      	c0005770:	80 7f 00 04 	lwz     r3,4(r31)
      	c0005774:	7d 29 03 a6 	mtctr   r9
      	c0005778:	7f c4 f3 78 	mr      r4,r30
      	c000577c:	4e 80 04 21 	bctrl
      	c0005780:	85 3f 00 0c 	lwzu    r9,12(r31)
      	c0005784:	2c 09 00 00 	cmpwi   r9,0
      	c0005788:	40 82 ff e4 	bne     c000576c <__do_irq+0x6c>
      
      Besides now using a direct 'bl' instead of a 'load/mtctr/bctr'
      sequence, we can also see that one less register is saved on the
      stack.
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/6ec2a7865ed6a5ec54ab46d026785bafe1d837ea.1630484892.git.christophe.leroy@csgroup.eu
    • powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC · 566af8cd
      Christophe Leroy authored
      Commit e65e1fc2 ("[PATCH] syscall class hookup for all normal
      targets") added generic support for AUDIT, but that didn't include
      support for bi-arch platforms like powerpc.
      
      Commit 4b588411 ("audit: Add generic compat syscall support")
      added generic support for bi-arch.
      
      Convert powerpc to that bi-arch generic audit support.
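      
      The point of bi-arch support is that a compat task must report a
      different audit arch than a native one. A simplified sketch of
      what powerpc's syscall_get_arch() returns (the AUDIT_ARCH_*
      constants are real, from include/uapi/linux/audit.h; helper
      details are elided):
      
      	#include <linux/audit.h>
      	#include <linux/sched.h>
      
      	static inline int syscall_get_arch(struct task_struct *task)
      	{
      		if (is_32bit_task())		/* 32-bit (compat) task */
      			return AUDIT_ARCH_PPC;
      		else if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
      			return AUDIT_ARCH_PPC64LE;
      		return AUDIT_ARCH_PPC64;
      	}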
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/a4b3951d1191d4183d92a07a6097566bde60d00a.1629812058.git.christophe.leroy@csgroup.eu
    • powerpc/fsl_booke: Enable STRICT_KERNEL_RWX · 49e3d8ea
      Christophe Leroy authored
      Enable STRICT_KERNEL_RWX on fsl_booke.
      
      For that, we need additional TLBCAMs dedicated to the linear
      mapping, based on the alignment of _sinittext.
      
      By default, up to 768 Mbytes of memory are mapped, using 3 TLBCAMs
      of 256 Mbytes each.
      
      With a data alignment of 16 Mbytes, we need up to 9 TLBCAMs:
        16/16/16/16/64/64/64/256/256
      
      With a data alignment of 4 Mbytes, we need up to 12 TLBCAMs:
        4/4/4/4/16/16/16/64/64/64/256/256
      
      With a data alignment of 1 Mbyte, we need up to 15 TLBCAMs:
        1/1/1/1/4/4/4/16/16/16/64/64/64/256/256
      
      By default, set a 16 Mbytes alignment as a compromise between
      memory usage and number of TLBCAMs. This can be adjusted manually
      when needed; the sketch below reproduces the breakdowns above.
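      
      A standalone C sketch of that covering scheme (my reconstruction
      of the arithmetic, not the kernel code): TLBCAM entries are
      naturally aligned power-of-4 sizes, and in the worst case the
      RO/RW boundary sits exactly at the chosen alignment, so no entry
      may cross it.
      
      	#include <stdio.h>
      
      	/* Largest naturally aligned power-of-4 entry that fits at
      	 * 'addr' without crossing 'limit' (values in Mbytes). */
      	static unsigned long cam_size(unsigned long addr, unsigned long limit)
      	{
      		unsigned long sz = 1;
      
      		while (sz * 4 <= limit - addr && addr % (sz * 4) == 0)
      			sz *= 4;
      		return sz;
      	}
      
      	int main(void)
      	{
      		unsigned long ram = 768, align = 16;	/* try 16, 4 or 1 */
      		unsigned long addr = 0, n = 0;
      
      		while (addr < ram) {
      			/* Worst case: the RO/RW boundary sits at 'align'
      			 * and an entry must not span it. */
      			unsigned long limit = addr < align ? align : ram;
      			unsigned long sz = cam_size(addr, limit);
      
      			printf("%lu/", sz);
      			addr += sz;
      			n++;
      		}
      		printf("=> %lu TLBCAMs\n", n);
      		return 0;
      	}
      
      With align set to 16, 4 or 1, it prints the 9-, 12- and 15-entry
      breakdowns listed above.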
      
      For the time being, it doesn't work when the base is randomised.
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/29f9e5d2bbbc83ae9ca879265426a6278bf4d5bb.1634292136.git.christophe.leroy@csgroup.eu
    • powerpc/booke: Disable STRICT_KERNEL_RWX, DEBUG_PAGEALLOC and KFENCE · 68b44f94
      Christophe Leroy authored
      fsl_booke and 44x are not able to map kernel linear memory with
      pages, so they can't support DEBUG_PAGEALLOC and KFENCE, and
      STRICT_KERNEL_RWX is also a problem for now.
      
      Enable those options only on book3s (both 32 and 64, except
      KFENCE), 8xx and 40x.
      
      Fixes: 88df6e90 ("[POWERPC] DEBUG_PAGEALLOC for 32-bit")
      Fixes: 95902e6c ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32")
      Fixes: 90cbac0e ("powerpc: Enable KFENCE for PPC32")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/d1ad9fdd9b27da3fdfa16510bb542ed51fa6e134.1634292136.git.christophe.leroy@csgroup.eu
  4. 25 August 2021, 1 commit
  5. 16 August 2021, 1 commit
  6. 30 July 2021, 2 commits
  7. 01 July 2021, 2 commits
  8. 30 June 2021, 1 commit
  9. 21 June 2021, 3 commits
  10. 26 May 2021, 2 commits
  11. 23 May 2021, 1 commit
  12. 06 May 2021, 3 commits
  13. 04 May 2021, 2 commits
  14. 01 May 2021, 1 commit
  15. 21 April 2021, 1 commit
  16. 14 April 2021, 2 commits
  17. 08 April 2021, 1 commit
  18. 03 April 2021, 3 commits
  19. 31 March 2021, 1 commit
  20. 29 March 2021, 3 commits
  21. 24 March 2021, 1 commit