1. 19 6月, 2017 4 次提交
  2. 15 6月, 2017 2 次提交
    • N
      powerpc/64s: Avoid cpabort in context switch when possible · 07d2a628
      Nicholas Piggin 提交于
      The ISA v3.0B copy-paste facility only requires cpabort when switching
      to a process that has foreign real addresses mapped (direct access to
      accelerators), to clear a potential copy buffer filled by a previous
      thread. There is no accelerator driver implemented yet, so cpabort can
      be removed. It can be be re-added when a driver is implemented.
      
      POWER9 DD1 requires the copy buffer to always be cleared on context
      switch, but if accelerators are not in use, then an unpaired copy from
      a dummy region is sufficient to clear data out of the copy buffer.
      
      This increases context switch performance by about 5% on POWER9.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      07d2a628
    • N
      powerpc/64: Drop explicit hwsync in context switch · 9145effd
      Nicholas Piggin 提交于
      The sync (aka. hwsync, aka. heavyweight sync) in the context switch
      code to prevent MMIO access being reordered from the point of view of
      a single process if it gets migrated to a different CPU is not
      required because there is an hwsync performed earlier in the context
      switch path.
      
      Comment this so it's clear enough if anything changes on the scheduler
      or the powerpc sides. Remove the hwsync from _switch.
      
      This improves context switch performance by 2-3% on POWER8.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9145effd
  3. 06 6月, 2017 1 次提交
    • N
      powerpc/64s: Machine check handle ifetch from foreign real address for POWER9 · 90df4bfb
      Nicholas Piggin 提交于
      The i-side 0111b machine check, which is "Instruction Fetch to foreign
      address space", was missed by 7b9f71f9 ("powerpc/64s: POWER9 machine
      check handler").
      
          The POWER9 processor core considers host real addresses with a
          nonzero value in RA(8:12) as foreign address space, accessible only
          by the copy and paste instructions. The copy and paste instruction
          pair can be used to invoke the Nest accelerators via the Virtual
          Accelerator Switchboard (VAS).
      
      It is an error for any regular load/store or ifetch to go to a foreign
      addresses. When relocation is on, this causes an MMU exception. When
      relocation is off, a machine check exception. It is possible to trigger
      this machine check by branching to a foreign address with MSR[IR]=0.
      
      Fixes: 7b9f71f9 ("powerpc/64s: POWER9 machine check handler")
      Reported-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      90df4bfb
  4. 05 6月, 2017 3 次提交
  5. 02 6月, 2017 5 次提交
    • H
      powerpc/fadump: Set an upper limit for boot memory size · 48a316e3
      Hari Bathini 提交于
      By default, 5% of system RAM is reserved for preserving boot memory.
      Alternatively, a user can specify the amount of memory to reserve.
      See Documentation/powerpc/firmware-assisted-dump.txt for details. In
      addition to the memory reserved for preserving boot memory, some more
      memory is reserved, to save HPTE region, CPU state data and ELF core
      headers.
      
      Memory Reservation during first kernel looks like below:
      
        Low memory                                        Top of memory
        0      boot memory size                                       |
        |           |                       |<--Reserved dump area -->|
        V           V                       |   Permanent Reservation V
        +-----------+----------/ /----------+---+----+-----------+----+
        |           |                       |CPU|HPTE|  DUMP     |ELF |
        +-----------+----------/ /----------+---+----+-----------+----+
              |                                           ^
              |                                           |
              \                                           /
               -------------------------------------------
                Boot memory content gets transferred to
                reserved area by firmware at the time of
                crash
      
      This implicitly means that the sum of the sizes of boot memory, CPU
      state data, HPTE region, DUMP preserving area and ELF core headers
      can't be greater than the total memory size. But currently, a user is
      allowed to specify any value as boot memory size. So, the above rule
      is violated when a boot memory size around 50% of the total available
      memory is specified. As the kernel is not handling this currently, it
      may lead to undefined behavior. Fix it by setting an upper limit for
      boot memory size to 25% of the total available memory. Also, instead
      of using memblock_end_of_DRAM(), which doesn't take the holes, if any,
      in the memory layout into account, use memblock_phys_mem_size() to
      calculate the percentage of total available memory.
      Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      48a316e3
    • C
      powerpc: Remove __ilog2()s and use generic ones · f782ddf2
      Christophe Leroy 提交于
      With the __ilog2() function as defined in
      arch/powerpc/include/asm/bitops.h, GCC will not optimise the code
      in case of constant parameter.
      
      The generic ilog2() function in include/linux/log2.h is written
      to handle the case of the constant parameter.
      
      This patch discards the three __ilog2() functions and
      defines __ilog2() as ilog2()
      
      For non constant calls, the generated code is doing the same:
      int test__ilog2(unsigned long x)
      {
      	return __ilog2(x);
      }
      
      int test__ilog2_u32(u32 n)
      {
      	return __ilog2_u32(n);
      }
      
      int test__ilog2_u64(u64 n)
      {
      	return __ilog2_u64(n);
      }
      
      On PPC32 before the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC32 after the patch:
      00000000 <test__ilog2>:
         0:	7c 63 00 34 	cntlzw  r3,r3
         4:	20 63 00 1f 	subfic  r3,r3,31
         8:	4e 80 00 20 	blr
      
      0000000c <test__ilog2_u32>:
         c:	7c 63 00 34 	cntlzw  r3,r3
        10:	20 63 00 1f 	subfic  r3,r3,31
        14:	4e 80 00 20 	blr
      
      On PPC64 before the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      
      On PPC64 after the patch:
      0000000000000000 <.test__ilog2>:
         0:	7c 63 00 74 	cntlzd  r3,r3
         4:	20 63 00 3f 	subfic  r3,r3,63
         8:	7c 63 07 b4 	extsw   r3,r3
         c:	4e 80 00 20 	blr
      
      0000000000000010 <.test__ilog2_u32>:
        10:	7c 63 00 34 	cntlzw  r3,r3
        14:	20 63 00 1f 	subfic  r3,r3,31
        18:	7c 63 07 b4 	extsw   r3,r3
        1c:	4e 80 00 20 	blr
      
      0000000000000020 <.test__ilog2_u64>:
        20:	7c 63 00 74 	cntlzd  r3,r3
        24:	20 63 00 3f 	subfic  r3,r3,63
        28:	7c 63 07 b4 	extsw   r3,r3
        2c:	4e 80 00 20 	blr
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f782ddf2
    • C
      powerpc: Replace ffz() by equivalent generic function · 22ef33b3
      Christophe Leroy 提交于
      With the ffz() function as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter.
      
      This patch replaces ffz() by the generic function.
      
      The generic ffz(x) expects to never be called with ~x == 0
      as written in the comment in include/asm-generic/bitops/ffz.h
      The only user of ffz() within arch/powerpc/ is
      platforms/512x/mpc5121_ads_cpld.c, which checks if x is not 0xff
      
      For non constant calls, the generated code is doing the same:
      
      unsigned long testffz(unsigned long x)
      {
      	return ffz(x);
      }
      
      On PPC32, before the patch:
      00000018 <testffz>:
        18:	7c 63 18 f9 	not.    r3,r3
        1c:	40 82 00 0c 	bne     28 <testffz+0x10>
        20:	38 60 00 20 	li      r3,32
        24:	4e 80 00 20 	blr
        28:	7d 23 00 d0 	neg     r9,r3
        2c:	7d 23 18 38 	and     r3,r9,r3
        30:	7c 63 00 34 	cntlzw  r3,r3
        34:	20 63 00 1f 	subfic  r3,r3,31
        38:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      00000018 <testffz>:
        18:	39 23 00 01 	addi    r9,r3,1
        1c:	7d 23 18 78 	andc    r3,r9,r3
        20:	7c 63 00 34 	cntlzw  r3,r3
        24:	20 63 00 1f 	subfic  r3,r3,31
        28:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      0000000000000030 <.testffz>:
        30:	7c 60 18 f9 	not.    r0,r3
        34:	38 60 00 40 	li      r3,64
        38:	4d 82 00 20 	beqlr
        3c:	7c 60 00 d0 	neg     r3,r0
        40:	7c 63 00 38 	and     r3,r3,r0
        44:	7c 63 00 74 	cntlzd  r3,r3
        48:	20 63 00 3f 	subfic  r3,r3,63
        4c:	7c 63 07 b4 	extsw   r3,r3
        50:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000030 <.testffz>:
        30:	38 03 00 01 	addi    r0,r3,1
        34:	7c 03 18 78 	andc    r3,r0,r3
        38:	7c 63 00 74 	cntlzd  r3,r3
        3c:	20 63 00 3f 	subfic  r3,r3,63
        40:	4e 80 00 20 	blr
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      22ef33b3
    • C
      powerpc: Use builtin functions for fls()/__fls()/fls64() · 2fcff790
      Christophe Leroy 提交于
      With the fls() functions as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter.
      
      This patch replaces __fls() by the builtin function, and modifies
      fls() and fls64() to use builtins instead of inline assembly
      
      For non constant calls, the generated code is doing the same:
      
      int testfls(unsigned int x)
      {
      	return fls(x);
      }
      
      unsigned long test__fls(unsigned long x)
      {
      	return __fls(x);
      }
      
      int testfls64(__u64 x)
      {
      	return fls64(x);
      }
      
      On PPC32, before the patch:
      00000064 <testfls>:
        64:	7c 63 00 34 	cntlzw  r3,r3
        68:	20 63 00 20 	subfic  r3,r3,32
        6c:	4e 80 00 20 	blr
      
      00000070 <test__fls>:
        70:	7c 63 00 34 	cntlzw  r3,r3
        74:	20 63 00 1f 	subfic  r3,r3,31
        78:	4e 80 00 20 	blr
      
      0000007c <testfls64>:
        7c:	2c 03 00 00 	cmpwi   r3,0
        80:	40 82 00 10 	bne     90 <testfls64+0x14>
        84:	7c 83 00 34 	cntlzw  r3,r4
        88:	20 63 00 20 	subfic  r3,r3,32
        8c:	4e 80 00 20 	blr
        90:	7c 63 00 34 	cntlzw  r3,r3
        94:	20 63 00 40 	subfic  r3,r3,64
        98:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      00000054 <testfls>:
        54:	7c 63 00 34 	cntlzw  r3,r3
        58:	20 63 00 20 	subfic  r3,r3,32
        5c:	4e 80 00 20 	blr
      
      00000060 <test__fls>:
        60:	7c 63 00 34 	cntlzw  r3,r3
        64:	20 63 00 1f 	subfic  r3,r3,31
        68:	4e 80 00 20 	blr
      
      0000006c <testfls64>:
        6c:	2c 03 00 00 	cmpwi   r3,0
        70:	41 82 00 10 	beq     80 <testfls64+0x14>
        74:	7c 63 00 34 	cntlzw  r3,r3
        78:	20 63 00 40 	subfic  r3,r3,64
        7c:	4e 80 00 20 	blr
        80:	7c 83 00 34 	cntlzw  r3,r4
        84:	20 63 00 40 	subfic  r3,r3,32
        88:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      00000000000000a0 <.testfls>:
        a0:	7c 63 00 34 	cntlzw  r3,r3
        a4:	20 63 00 20 	subfic  r3,r3,32
        a8:	7c 63 07 b4 	extsw   r3,r3
        ac:	4e 80 00 20 	blr
      
      00000000000000b0 <.test__fls>:
        b0:	7c 63 00 74 	cntlzd  r3,r3
        b4:	20 63 00 3f 	subfic  r3,r3,63
        b8:	7c 63 07 b4 	extsw   r3,r3
        bc:	4e 80 00 20 	blr
      
      00000000000000c0 <.testfls64>:
        c0:	7c 63 00 74 	cntlzd  r3,r3
        c4:	20 63 00 40 	subfic  r3,r3,64
        c8:	7c 63 07 b4 	extsw   r3,r3
        cc:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000090 <.testfls>:
        90:	7c 63 00 34 	cntlzw  r3,r3
        94:	20 63 00 20 	subfic  r3,r3,32
        98:	7c 63 07 b4 	extsw   r3,r3
        9c:	4e 80 00 20 	blr
      
      00000000000000a0 <.test__fls>:
        a0:	7c 63 00 74 	cntlzd  r3,r3
        a4:	20 63 00 3f 	subfic  r3,r3,63
        a8:	4e 80 00 20 	blr
        ac:	60 00 00 00 	nop
      
      00000000000000b0 <.testfls64>:
        b0:	7c 63 00 74 	cntlzd  r3,r3
        b4:	20 63 00 40 	subfic  r3,r3,64
        b8:	7c 63 07 b4 	extsw   r3,r3
        bc:	4e 80 00 20 	blr
      
      Those builtins have been in GCC since at least 3.4.6 (see
      https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html )
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      2fcff790
    • C
      powerpc: Discard ffs()/__ffs() function and use builtin functions instead · f83647d6
      Christophe Leroy 提交于
      With the ffs() function as defined in arch/powerpc/include/asm/bitops.h
      GCC will not optimise the code in case of constant parameter, as shown
      by the small exemple below.
      
      int ffs_test(void)
      {
      	return 4 << ffs(31);
      }
      
      c0012334 <ffs_test>:
      c0012334:       39 20 00 01     li      r9,1
      c0012338:       38 60 00 04     li      r3,4
      c001233c:       7d 29 00 34     cntlzw  r9,r9
      c0012340:       21 29 00 20     subfic  r9,r9,32
      c0012344:       7c 63 48 30     slw     r3,r3,r9
      c0012348:       4e 80 00 20     blr
      
      With this patch, the same function will compile as follows:
      
      c0012334 <ffs_test>:
      c0012334:       38 60 00 08     li      r3,8
      c0012338:       4e 80 00 20     blr
      
      The same happens with __ffs()
      
      For non constant calls, the generated code is doing the same,
      allthought it is slightly different on 64 bits for ffs():
      
      unsigned long test__ffs(unsigned long x)
      {
      	return __ffs(x);
      }
      
      int testffs(int x)
      {
      	return ffs(x);
      }
      
      On PPC32, before the patch:
      0000003c <test__ffs>:
        3c:	7d 23 00 d0 	neg     r9,r3
        40:	7d 23 18 38 	and     r3,r9,r3
        44:	7c 63 00 34 	cntlzw  r3,r3
        48:	20 63 00 1f 	subfic  r3,r3,31
        4c:	4e 80 00 20 	blr
      
      00000050 <testffs>:
        50:	7d 23 00 d0 	neg     r9,r3
        54:	7d 23 18 38 	and     r3,r9,r3
        58:	7c 63 00 34 	cntlzw  r3,r3
        5c:	20 63 00 20 	subfic  r3,r3,32
        60:	4e 80 00 20 	blr
      
      On PPC32, after the patch:
      0000002c <test__ffs>:
        2c:	7d 23 00 d0 	neg     r9,r3
        30:	7d 23 18 38 	and     r3,r9,r3
        34:	7c 63 00 34 	cntlzw  r3,r3
        38:	20 63 00 1f 	subfic  r3,r3,31
        3c:	4e 80 00 20 	blr
      
      00000040 <testffs>:
        40:	7d 23 00 d0 	neg     r9,r3
        44:	7d 23 18 38 	and     r3,r9,r3
        48:	7c 63 00 34 	cntlzw  r3,r3
        4c:	20 63 00 20 	subfic  r3,r3,32
        50:	4e 80 00 20 	blr
      
      On PPC64, before the patch:
      0000000000000060 <.test__ffs>:
        60:	7c 03 00 d0 	neg     r0,r3
        64:	7c 03 18 38 	and     r3,r0,r3
        68:	7c 63 00 74 	cntlzd  r3,r3
        6c:	20 63 00 3f 	subfic  r3,r3,63
        70:	7c 63 07 b4 	extsw   r3,r3
        74:	4e 80 00 20 	blr
      
      0000000000000080 <.testffs>:
        80:	7c 03 00 d0 	neg     r0,r3
        84:	7c 03 18 38 	and     r3,r0,r3
        88:	7c 63 00 74 	cntlzd  r3,r3
        8c:	20 63 00 40 	subfic  r3,r3,64
        90:	7c 63 07 b4 	extsw   r3,r3
        94:	4e 80 00 20 	blr
      
      On PPC64, after the patch:
      0000000000000050 <.test__ffs>:
        50:	7c 03 00 d0 	neg     r0,r3
        54:	7c 03 18 38 	and     r3,r0,r3
        58:	7c 63 00 74 	cntlzd  r3,r3
        5c:	20 63 00 3f 	subfic  r3,r3,63
        60:	4e 80 00 20 	blr
      
      0000000000000070 <.testffs>:
        70:	7c 03 00 d0 	neg     r0,r3
        74:	7c 03 18 38 	and     r3,r0,r3
        78:	7c 63 00 34 	cntlzw  r3,r3
        7c:	20 63 00 20 	subfic  r3,r3,32
        80:	7c 63 07 b4 	extsw   r3,r3
        84:	4e 80 00 20 	blr
      (ffs() operates on an int so cntlzw is equivalent to cntlzd)
      
      In addition, when reading the generated vmlinux, we can observe
      that with the builtin functions, GCC sometimes efficiently spreads
      the instructions within the generated functions while the inline
      assembly force them to remain grouped together.
      
      __builtin_ffs() is already used in arch/powerpc/include/asm/page_32.h
      
      Those builtins have been in GCC since at least 3.4.6 (see
      https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Other-Builtins.html )
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f83647d6
  6. 30 5月, 2017 4 次提交
    • N
      powerpc/64: Tool to check head sections location sanity · c494adef
      Nicholas Piggin 提交于
      Use a tool to check that the location of "fixed sections" are where
      we expected them to be, which catches cases the linker script can't
      (stubs being added to start of .text section), and which ends up
      being neater.
      
      Sample output:
      
        ERROR: start_text address is c000000000008100, should be c000000000008000
        ERROR: see comments in arch/powerpc/tools/head_check.sh
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Fold in fix from Nick for 4.6 era toolchains]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c494adef
    • N
      powerpc/64: Handle linker stubs in low .text code · 951eedeb
      Nicholas Piggin 提交于
      Very large kernels may require linker stubs for branches from HEAD
      text code. The linker may place these stubs before the HEAD text
      sections, which breaks the assumption that HEAD text is located at 0
      (or the .text section being located at 0x7000/0x8000 on Book3S
      kernels).
      
      Provide an option to create a small section just before the .text
      section with an empty 256 - 4 bytes, and adjust the start of the .text
      section to match. The linker will tend to put stubs in that section
      and not break our relative-to-absolute offset assumptions.
      
      This causes a small waste of space on common kernels, but allows large
      kernels to build and boot. For now, it is an EXPERT config option,
      defaulting to =n, but a reference is provided for it in the build-time
      check for such breakage. This is good enough for allyesconfig and
      custom users / hackers.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      951eedeb
    • G
      powerpc/powernv/idle: Use Requested Level for restoring state on P9 DD1 · 22c6663d
      Gautham R. Shenoy 提交于
      On Power9 DD1 due to a hardware bug the Power-Saving Level Status
      field (PLS) of the PSSCR for a thread waking up from a deep state can
      under-report if some other thread in the core is in a shallow stop
      state. The scenario in which this can manifest is as follows:
      
         1) All the threads of the core are in deep stop.
         2) One of the threads is woken up. The PLS for this thread will
            correctly reflect that it is waking up from deep stop.
         3) The thread that has woken up now executes a shallow stop.
         4) When some other thread in the core is woken, its PLS will reflect
            the shallow stop state.
      
      Thus, the subsequent thread for which the PLS is under-reporting the
      wakeup state will not restore the hypervisor resources.
      
      Hence, on DD1 systems, use the Requested Level (RL) field as a
      workaround to restore the contents of the hypervisor resources on the
      wakeup from the stop state.
      Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      22c6663d
    • N
      powerpc/64s: Fix FIXUP_ENDIAN non-maskable interrupt reentrancy · f1fe5252
      Nicholas Piggin 提交于
      FIXUP_ENDIAN uses SRR[01] with MSR_RI=1, which gets corrupted if there
      is an interleaving system reset or machine check interrupt.
      
      Set MSR_RI=0 before setting SRRs. The rfid will restore MSR.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f1fe5252
  7. 19 5月, 2017 1 次提交
    • M
      powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hash · e41e53cd
      Michael Ellerman 提交于
      virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on
      an address. What this means in practice is that it should only return true for
      addresses in the linear mapping which are backed by a valid PFN.
      
      We are failing to properly check that the address is in the linear mapping,
      because virt_to_pfn() will return a valid looking PFN for more or less any
      address. That bug is actually caused by __pa(), used in virt_to_pfn().
      
      eg: __pa(0xc000000000010000) = 0x10000  # Good
          __pa(0xd000000000010000) = 0x10000  # Bad!
          __pa(0x0000000000010000) = 0x10000  # Bad!
      
      This started happening after commit bdbc29c1 ("powerpc: Work around gcc
      miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition
      of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from
      the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give
      you something bogus back.
      
      Until we can verify if that GCC bug is no longer an issue, or come up with
      another solution, this commit does the minimal fix to make virt_addr_valid()
      work, by explicitly checking that the address is in the linear mapping region.
      
      Fixes: bdbc29c1 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NPaul Mackerras <paulus@ozlabs.org>
      Reviewed-by: NBalbir Singh <bsingharora@gmail.com>
      Tested-by: NBreno Leitao <breno.leitao@gmail.com>
      e41e53cd
  8. 15 5月, 2017 1 次提交
    • M
      powerpc/modules: If mprofile-kernel is enabled add it to vermagic · 43e24e82
      Michael Ellerman 提交于
      On powerpc we can build the kernel with two different ABIs for mcount(), which
      is used by ftrace. Kernels built with one ABI do not know how to load modules
      built with the other ABI. The new style ABI is called "mprofile-kernel", for
      want of a better name.
      
      Currently if we build a module using the old style ABI, and the kernel with
      mprofile-kernel, when we load the module we'll oops something like:
      
        # insmod autofs4-no-mprofile-kernel.ko
        ftrace-powerpc: Unexpected instruction f8810028 around bl _mcount
        ------------[ cut here ]------------
        WARNING: CPU: 6 PID: 3759 at ../kernel/trace/ftrace.c:2024 ftrace_bug+0x2b8/0x3c0
        CPU: 6 PID: 3759 Comm: insmod Not tainted 4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74 #11
        ...
        NIP [c0000000001eaa48] ftrace_bug+0x2b8/0x3c0
        LR [c0000000001eaff8] ftrace_process_locs+0x4a8/0x590
        Call Trace:
          alloc_pages_current+0xc4/0x1d0 (unreliable)
          ftrace_process_locs+0x4a8/0x590
          load_module+0x1c8c/0x28f0
          SyS_finit_module+0x110/0x140
          system_call+0x38/0xfc
        ...
        ftrace failed to modify
        [<d000000002a31024>] 0xd000000002a31024
         actual:   35:65:00:48
      
      We can avoid this by including in the vermagic whether the kernel/module was
      built with mprofile-kernel. Which results in:
      
        # insmod autofs4-pg.ko
        autofs4: version magic
        '4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74 SMP mod_unload modversions '
        should be
        '4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74-dirty SMP mod_unload modversions mprofile-kernel'
        insmod: ERROR: could not insert module autofs4-pg.ko: Invalid module format
      
      Fixes: 8c50b72a ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: NBalbir Singh <bsingharora@gmail.com>
      Acked-by: NJessica Yu <jeyu@redhat.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      43e24e82
  9. 09 5月, 2017 3 次提交
  10. 05 5月, 2017 1 次提交
    • S
      powerpc/64e: Don't place the stack beyond TASK_SIZE · 61baf155
      Scott Wood 提交于
      Commit f4ea6dcb ("powerpc/mm: Enable mappings above 128TB") increased
      the task size on book3s, and introduced a mechanism to dynamically
      control whether a task uses these larger addresses.  While the change to
      the task size itself was ifdef-protected to only apply on book3s, the
      change to STACK_TOP_USER64 was not.  On book3e, this had the effect of
      trying to use addresses up to 128TiB for the stack despite a 64TiB task
      size limit -- which broke 64-bit userspace producing the following errors:
      
      Starting init: /sbin/init exists but couldn't execute it (error -14)
      Starting init: /bin/sh exists but couldn't execute it (error -14)
      Kernel panic - not syncing: No working init found.  Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
      
      Fixes: f4ea6dcb ("powerpc/mm: Enable mappings above 128TB")
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NScott Wood <oss@buserror.net>
      61baf155
  11. 03 5月, 2017 1 次提交
    • C
      powerpc/8xx: Adding support of IRQ in MPC8xx GPIO · 726bd223
      Christophe Leroy 提交于
      This patch allows the use of IRQ to notify the change of GPIO status
      on MPC8xx CPM IO ports. This then allows to associate IRQs to GPIOs
      in the Device Tree.
      
      Ex:
      	CPM1_PIO_C: gpio-controller@960 {
      		#gpio-cells = <2>;
      		compatible = "fsl,cpm1-pario-bank-c";
      		reg = <0x960 0x10>;
      		fsl,cpm1-gpio-irq-mask = <0x0fff>;
      		interrupts = <1 2 6 9 10 11 14 15 23 24 26 31>;
      		interrupt-parent = <&CPM_PIC>;
      		gpio-controller;
      	};
      
      The property 'fsl,cpm1-gpio-irq-mask' defines which of the 16 GPIOs
      have the associated interrupts defined in the 'interrupts' property.
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NScott Wood <oss@buserror.net>
      726bd223
  12. 28 4月, 2017 8 次提交
  13. 27 4月, 2017 2 次提交
  14. 24 4月, 2017 3 次提交
    • D
      powerpc/mm: Ensure IRQs are off in switch_mm() · 9765ad13
      David Gibson 提交于
      powerpc expects IRQs to already be (soft) disabled when switch_mm() is
      called, as made clear in the commit message of 9c1e1052 ("powerpc: Allow
      perf_counters to access user memory at interrupt time").
      
      Aside from any race conditions that might exist between switch_mm() and an IRQ,
      there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't
      followed at some point by an IRQ enable then interrupts will remain disabled
      until we return to userspace.
      
      It is true that when switch_mm() is called from the scheduler IRQs are off, but
      not when it's called by use_mm(). Looking closer we see that last year in commit
      f98db601 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler")
      this was made more explicit by the addition of switch_mm_irqs_off() which is now
      called by the scheduler, vs switch_mm() which is used by use_mm().
      
      Arguably it is a bug in use_mm() to call switch_mm() in a different context than
      it expects, but fixing that will take time.
      
      This was discovered recently when vhost started throwing warnings such as:
      
        BUG: sleeping function called from invalid context at kernel/mutex.c:578
        in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760
        no locks held by vhost-10760/10768.
        irq event stamp: 10
        hardirqs last  enabled at (9):  _raw_spin_unlock_irq+0x40/0x80
        hardirqs last disabled at (10): switch_slb+0x2e4/0x490
        softirqs last  enabled at (0):  copy_process+0x5e8/0x1260
        softirqs last disabled at (0):  (null)
        Call Trace:
          show_stack+0x88/0x390 (unreliable)
          dump_stack+0x30/0x44
          __might_sleep+0x1c4/0x2d0
          mutex_lock_nested+0x74/0x5c0
          cgroup_attach_task_all+0x5c/0x180
          vhost_attach_cgroups_work+0x58/0x80 [vhost]
          vhost_worker+0x24c/0x3d0 [vhost]
          kthread+0xec/0x100
          ret_from_kernel_thread+0x5c/0xd4
      
      Prior to commit 04b96e55 ("vhost: lockless enqueuing") (Aug 2016) the
      vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(),
      which had the effect of reenabling IRQs. Since that commit removed the locking
      in vhost_worker() the body of the vhost_worker() loop now runs with interrupts
      off causing the warnings.
      
      This patch addresses the problem by making the powerpc code mirror the x86 code,
      ie. we disable interrupts in switch_mm(), and optimise the scheduler case by
      defining switch_mm_irqs_off().
      
      Cc: stable@vger.kernel.org # v4.7+
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      [mpe: Flesh out/rewrite change log, add stable]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9765ad13
    • N
      powerpc: Introduce a new helper to obtain function entry points · 1b32cd17
      Naveen N. Rao 提交于
      kprobe_lookup_name() is specific to the kprobe subsystem and may not always
      return the function entry point (in a subsequent patch for KPROBES_ON_FTRACE).
      For looking up function entry points, introduce a separate helper and use it
      in optprobes.c
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1b32cd17
    • N
      powerpc/kprobes: Add support for KPROBES_ON_FTRACE · ead514d5
      Naveen N. Rao 提交于
      Allow kprobes to be placed on ftrace _mcount() call sites. This optimization
      avoids the use of a trap, by riding on ftrace infrastructure.
      
      This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on MPROFILE_KERNEL,
      which is only currently enabled on powerpc64le with newer toolchains.
      
      Based on the x86 code by Masami.
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ead514d5
  15. 23 4月, 2017 1 次提交