1. 27 Mar, 2018 6 commits
  2. 23 Mar, 2018 1 commit
    • P
      powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a
      Paul Mackerras committed
      POWER9 processors up to and including "Nimbus" v2.2 have hardware
      bugs relating to transactional memory and thread reconfiguration.
      One of these bugs has a workaround which is to get the core into
      SMT4 state temporarily.  This workaround is only needed when
      running bare-metal.
      
      This patch provides a function which gets the core into SMT4 mode
      by preventing threads from going to a stop state, and waking up
      those which are already in a stop state.  Once at least 3 threads
      are not in a stop state, the core will be in SMT4 and we can
      continue.
      
      To do this, we add a "dont_stop" flag to the paca to tell the
      thread not to go into a stop state.  If this flag is set,
      power9_idle_stop() just returns immediately with a return value
      of 0.  The pnv_power9_force_smt4_catch() function does the following:
      
      1. Set the dont_stop flag for each thread in the core, except
         ourselves (in fact we use an atomic_inc() in case more than
         one thread is calling this function concurrently).
      2. See how many threads are awake, indicated by their
         requested_psscr field in the paca being 0.  If this is at
         least 3, we are done and steps 3 and 4 are skipped.
      3. Send a doorbell interrupt to each thread that was seen as
         being in a stop state in step 2.
      4. Until at least 3 threads are awake, scan the threads to which
         we sent a doorbell interrupt and check if they are awake now.
      
      This relies on the following properties:
      
      - Once dont_stop is non-zero, requested_psscr can't go from zero to
        non-zero, except transiently (and without the thread doing stop).
      - requested_psscr being zero guarantees that the thread isn't in
        a state-losing stop state where thread reconfiguration could occur.
      - Doing stop with a PSSCR value of 0 won't be a state-losing stop
        and thus won't allow thread reconfiguration.
      - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
        must be in SMT4 mode, since SMT modes are powers of 2.
      
      This does add a sync to power9_idle_stop(), which is necessary to
      provide the correct ordering between setting requested_psscr and
      checking dont_stop.  The overhead of the sync should be unnoticeable
      compared to the latency of going into and out of a stop state.
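The ordering requirement in the paragraph above can be sketched in userspace C11. This is only an illustration under stated assumptions: the two globals stand in for the paca fields, `idle_stop_sketch()` is a hypothetical name, the return value 1 is invented to mark "would have stopped", and a sequentially consistent fence stands in for the sync instruction.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the two paca fields named in the message. */
static atomic_int dont_stop;
static atomic_ulong requested_psscr;

/*
 * Idle-side sketch: returns 0 if the stop was refused because dont_stop
 * is set, 1 if the thread would have gone on to execute stop.
 */
int idle_stop_sketch(unsigned long psscr)
{
    /* Publish the intent to stop first. */
    atomic_store_explicit(&requested_psscr, psscr, memory_order_relaxed);

    /* Full fence standing in for the sync: orders the store above
     * against the load below. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&dont_stop, memory_order_relaxed)) {
        /* Another thread is forcing SMT4: stay awake and advertise it. */
        atomic_store_explicit(&requested_psscr, 0, memory_order_relaxed);
        return 0;
    }
    return 1; /* the real code would execute stop here */
}
```

Without the fence, the load of `dont_stop` could be satisfied before the store to `requested_psscr` becomes visible, so both sides could miss each other's update.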
      
      Because some objected to incurring this extra latency on systems where
      the XER[SO] bug is not relevant, I have put the test in
      power9_idle_stop inside a feature section.  This means that
      pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
      without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
      probably hang the system.
      
      In order to cater for uses where the caller has an operation that
      has to be done while the core is in SMT4, the core continues to be
      kept in SMT4 after pnv_power9_force_smt4_catch() function returns,
      until the pnv_power9_force_smt4_release() function is called.
      It undoes the effect of step 1 above and allows the other threads
      to go into a stop state.
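The catch and release sequence described in this message can be modelled in a single-threaded userspace sketch. Everything here is an assumption made for illustration: the struct, the `_sketch` function names, and a "doorbell" that wakes its target immediately. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define THREADS_PER_CORE 4

/* Hypothetical stand-in for the per-thread paca fields. */
struct thread_sketch {
    atomic_int dont_stop;         /* non-zero: must not enter a stop state */
    atomic_ulong requested_psscr; /* 0 means the thread is awake */
};

static struct thread_sketch core[THREADS_PER_CORE];

/* Simulated doorbell: in this model the target wakes up at once. */
static void send_doorbell_sketch(int t)
{
    atomic_store(&core[t].requested_psscr, 0);
}

void force_smt4_catch_sketch(int self)
{
    const int need = THREADS_PER_CORE / 2 + 1; /* 3 of 4 threads */
    bool poked[THREADS_PER_CORE] = { false };
    int awake = 1; /* the caller itself is awake */

    /* Step 1: forbid the other threads from stopping. */
    for (int t = 0; t < THREADS_PER_CORE; t++)
        if (t != self)
            atomic_fetch_add(&core[t].dont_stop, 1);

    /* Step 2: count the threads that are already awake. */
    for (int t = 0; t < THREADS_PER_CORE; t++) {
        if (t == self)
            continue;
        if (atomic_load(&core[t].requested_psscr) == 0)
            awake++;
        else
            poked[t] = true;
    }
    if (awake >= need)
        return;

    /* Step 3: send a doorbell to each thread seen in a stop state. */
    for (int t = 0; t < THREADS_PER_CORE; t++)
        if (poked[t])
            send_doorbell_sketch(t);

    /* Step 4: rescan the poked threads until enough are awake. */
    while (awake < need)
        for (int t = 0; t < THREADS_PER_CORE; t++)
            if (poked[t] && atomic_load(&core[t].requested_psscr) == 0) {
                poked[t] = false;
                awake++;
            }
}

void force_smt4_release_sketch(int self)
{
    /* Undo step 1 so the other threads may stop again. */
    for (int t = 0; t < THREADS_PER_CORE; t++)
        if (t != self)
            atomic_fetch_sub(&core[t].dont_stop, 1);
}
```

Using an increment and decrement rather than a plain flag keeps the model correct when more than one thread calls the catch function at the same time, as the message notes for step 1.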
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 20 Mar, 2018 1 commit
  4. 14 Mar, 2018 3 commits
  5. 13 Mar, 2018 9 commits
  6. 06 Mar, 2018 1 commit
    • C
      powerpc/mm/slice: Fix hugepage allocation at hint address on 8xx · aa0ab02b
      Christophe Leroy committed
      On the 8xx, the page size is set in the PMD entry and applies to
      all pages of the page table pointed by the said PMD entry.
      
      When an app has some regular pages allocated (e.g. see below) and tries
      to mmap() a huge page at a hint address covered by the same PMD entry,
      the kernel accepts the hint although the 8xx cannot handle different
      page sizes in the same PMD entry.
      
      10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
      10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc
      
      mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
           MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000
      
      This results in the app remaining forever in do_page_fault()/hugetlb_fault()
      and when interrupting that app, we get the following warning:
      
      [162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
      [162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W       4.14.6 #85
      [162980.035744] task: c67e2c00 task.stack: c668e000
      [162980.035783] NIP:  c000fe18 LR: c00e1eec CTR: c00f90c0
      [162980.035830] REGS: c668fc20 TRAP: 0700   Tainted: G W        (4.14.6)
      [162980.035854] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24044224 XER: 20000000
      [162980.036003]
      [162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
      [162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
      [162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
      [162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
      [162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
      [162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
      [162980.036861] Call Trace:
      [162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
      [162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
      [162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
      [162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
      [162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
      [162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
      [162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
      [162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
      [162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
      [162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
      [162980.037781] Instruction dump:
      [162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
      [162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
      [162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
      [162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
      [162985.363322] BUG: non-zero nr_ptes on freeing mm: -1
      
      In order to fix this, this patch uses the address space "slices"
      implemented for BOOK3S/64 and enhanced to support PPC32 by the
      preceding patch.
      
      This patch modifies the context.id on the 8xx to be in the range
      [1:16] instead of [0:15] in order to identify context.id == 0 as
      not initialised contexts as done on BOOK3S
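A minimal sketch of why shifting the range helps: with ids handed out from 1 to 16, the value 0 is free to mean "not initialised". The bitmap allocator and all names below are hypothetical and much simpler than the kernel's context management.

```c
#include <assert.h>
#include <stdbool.h>

#define FIRST_CONTEXT_SKETCH 1
#define LAST_CONTEXT_SKETCH  16

/* Bit (id - 1) is set while context id is in use. */
static unsigned int context_map_sketch;

/* Returns an id in [1:16], or -1 when all sixteen are taken. */
int alloc_context_id_sketch(void)
{
    for (int id = FIRST_CONTEXT_SKETCH; id <= LAST_CONTEXT_SKETCH; id++) {
        unsigned int bit = 1u << (id - 1);
        if (!(context_map_sketch & bit)) {
            context_map_sketch |= bit;
            return id;
        }
    }
    return -1;
}

/* id 0 is never handed out, so it can mark an uninitialised context. */
bool context_is_initialised_sketch(int id)
{
    return id != 0;
}
```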
      
      This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
      selected for the 8xx
      
      Although we could in theory have as many slices as PMD entries, the
      current slices implementation limits the number of low slices to 16.
      This limitation does not prevent us from fixing the initial issue,
      although it is suboptimal. It will be cured in a subsequent patch.
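The granularity mismatch can be illustrated with a little address arithmetic. The shifts below are assumptions, not taken from the message: a PMD entry is assumed to cover 4 MB (the 4 KB page configuration), and the 16 low slices are assumed to divide a 4 GB address space evenly, giving 256 MB each.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: one PMD entry covers 4 MB, one low slice covers 256 MB. */
#define PMD_SHIFT_SKETCH       22
#define SLICE_LOW_SHIFT_SKETCH 28

unsigned int pmd_index_sketch(uint32_t addr)
{
    return addr >> PMD_SHIFT_SKETCH;
}

unsigned int low_slice_index_sketch(uint32_t addr)
{
    return addr >> SLICE_LOW_SHIFT_SKETCH;
}
```

Under these assumptions, the regular mapping at 0x10000000 and the hint address 0x10080000 from the example share both a PMD entry and a slice, so a per-slice page size rejects the hint. An address such as 0x10400000 lies under a different PMD entry but in the same slice, which is where the coarser slice granularity is more restrictive than strictly needed.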
      
      Fixes: 4b914286 ("powerpc/8xx: Implement support of hugepages")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  7. 23 Feb, 2018 2 commits
  8. 22 Feb, 2018 1 commit
    • I
      treewide/trivial: Remove ';;$' typo noise · ed7158ba
      Ingo Molnar committed
      On lkml suggestions were made to split up such trivial typo fixes into per subsystem
      patches:
      
        --- a/arch/x86/boot/compressed/eboot.c
        +++ b/arch/x86/boot/compressed/eboot.c
        @@ -439,7 +439,7 @@ setup_uga32(void **uga_handle, unsigned long size, u32 *width, u32 *height)
                struct efi_uga_draw_protocol *uga = NULL, *first_uga;
                efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID;
                unsigned long nr_ugas;
        -       u32 *handles = (u32 *)uga_handle;;
        +       u32 *handles = (u32 *)uga_handle;
                efi_status_t status = EFI_INVALID_PARAMETER;
                int i;
      
      This patch is the result of the following script:
      
        $ sed -i 's/;;$/;/g' $(git grep -E ';;$'  | grep "\.[ch]:"  | grep -vwE 'for|ia64' | cut -d: -f1 | sort | uniq)
      
      ... followed by manual review to make sure it's all good.
      
      Splitting this up is just crazy talk, let's get over with this and just do it.
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 15 Feb, 2018 1 commit
    • N
      powerpc/powernv: IMC fix out of bounds memory access at shutdown · e7bde88c
      Nicholas Piggin committed
      The OPAL IMC driver's shutdown handler disables nest PMU counters by
      walking nodes and taking the first CPU out of their cpumask, which is
      used to index into the paca (get_hard_smp_processor_id()). This does
      not always do the right thing, and in particular for CPU-less nodes it
      returns NR_CPUS and that overruns the paca and dereferences random
      memory.
      
      Fix it by being more careful about checking returned CPU, and only
      using online CPUs. It's not clear this shutdown code makes sense after
      commit 885dcd70 ("powerpc/perf: Add nest IMC PMU support"), but this
      should not make things worse
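The kind of check described above can be sketched in plain C. This is a hedged model, not the driver code: masks are plain bitmasks, `NR_CPUS_SKETCH` and both function names are hypothetical, and the "first CPU" helper returns the CPU count for an empty mask, which is the behaviour the message describes for CPU-less nodes.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 8u

/* Index of the lowest set bit, or NR_CPUS_SKETCH if the mask is empty. */
static unsigned int first_cpu_sketch(unsigned int mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS_SKETCH;
}

/*
 * Pick a CPU to run the shutdown call for a node: only online CPUs of
 * that node qualify. Returns -1 when there is none, so the caller can
 * skip the node instead of indexing past the end of a per-CPU array.
 */
int pick_shutdown_cpu_sketch(unsigned int node_mask, unsigned int online_mask)
{
    unsigned int cpu = first_cpu_sketch(node_mask & online_mask);

    if (cpu >= NR_CPUS_SKETCH)
        return -1;
    return (int)cpu;
}
```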
      
      Currently the bug causes us to call OPAL with a junk CPU number. A
      separate patch in development to change the way pacas are allocated
      escalates this bug into a crash:
      
        Unable to handle kernel paging request for data at address 0x2a21af1eeb000076
        Faulting instruction address: 0xc0000000000a5468
        Oops: Kernel access of bad area, sig: 11 [#1]
        ...
        NIP opal_imc_counters_shutdown+0x148/0x1d0
        LR  opal_imc_counters_shutdown+0x134/0x1d0
        Call Trace:
         opal_imc_counters_shutdown+0x134/0x1d0 (unreliable)
         platform_drv_shutdown+0x44/0x60
         device_shutdown+0x1f8/0x350
         kernel_restart_prepare+0x54/0x70
         kernel_restart+0x28/0xc0
         SyS_reboot+0x1d0/0x2c0
         system_call+0x58/0x6c
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 13 Feb, 2018 3 commits
    • G
      powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug · 82343484
      Guenter Roeck committed
      Commit e67e02a5 ("powerpc/pseries: Fix cpu hotplug crash with
      memoryless nodes") adds an unconditional call to
      find_and_online_cpu_nid(), which is only declared if CONFIG_PPC_SPLPAR
      is enabled. This results in the following build error if this is not
      the case.
      
        arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
        arch/powerpc/platforms/pseries/hotplug-cpu.c:369:
        			undefined reference to `.find_and_online_cpu_nid'
      
      Follow the guideline provided by similar functions and provide a dummy
      function if CONFIG_PPC_SPLPAR is not enabled. This also moves the
      external function declaration into an include file where it should be.
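The stub pattern mentioned above usually looks like the sketch below. The `_SKETCH` and `_sketch` suffixes mark everything here as hypothetical, and the `int (int)` signature is an assumption; only the shape of the conditional is the point.

```c
#include <assert.h>

/* In a shared header: real declaration or a do-nothing stub. */
#ifdef CONFIG_PPC_SPLPAR_SKETCH
int find_and_online_cpu_nid_sketch(int cpu);
#else
static inline int find_and_online_cpu_nid_sketch(int cpu)
{
    (void)cpu;   /* nothing to do when the feature is compiled out */
    return 0;
}
#endif

/* The caller stays unconditional and links in both configurations. */
int dlpar_online_cpu_sketch(int cpu)
{
    return find_and_online_cpu_nid_sketch(cpu);
}
```

Keeping the conditional in the header means callers need no `#ifdef` of their own, which is what avoids the undefined reference at link time.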
      
      Fixes: e67e02a5 ("powerpc/pseries: Fix cpu hotplug crash with memoryless nodes")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      [mpe: Change subject to emphasise the build fix]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/vas: Don't set uses_vas for kernel windows · b00b6289
      Nicholas Piggin committed
      cp_abort is only required for user windows, because kernel context
      must not be preempted between a copy/paste pair.
      
      Without this patch, the init task gets used_vas set when it runs the
      nx842_powernv_init initcall, which opens windows for kernel usage.
      
      used_vas is then never cleared anywhere, so it gets propagated into
      all other tasks. It's a property of the address space, so it should
      really be cleared when a new mm is created (or in dup_mmap if the
      mmaps are marked as VM_DONTCOPY). For now we seem to have no such
      driver, so leave that for another patch.
      
      Fixes: 6c8e6bb2 ("powerpc/vas: Add support for user receive window")
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • S
      powerpc/pseries: Enable RAS hotplug events later · c9dccf1d
      Sam Bobroff committed
      Currently if the kernel receives a memory hot-unplug event early
      enough, it may get stuck in an infinite loop in
      dissolve_free_huge_pages(). This appears as a stall just after:
      
        pseries-hotplug-mem: Attempting to hot-remove XX LMB(s) at YYYYYYYY
      
      It appears to be caused by "minimum_order" being uninitialized, due to
      init_ras_IRQ() executing before hugetlb_init().
      
      To correct this, extract the part of init_ras_IRQ() that enables
      hotplug event processing and place it in the machine_late_initcall
      phase, which is guaranteed to be after hugetlb_init() is called.
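The ordering argument can be modelled with a toy initcall runner. This is a sketch under assumptions: only two levels are modelled, the level names and all `_sketch` functions are hypothetical, and placing the hugetlb setup at the earlier level is assumed here rather than stated in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-level model of initcall ordering. */
enum level_sketch { LEVEL_SUBSYS, LEVEL_LATE, LEVEL_COUNT };

struct initcall_sketch {
    enum level_sketch level;
    void (*fn)(void);
};

static int minimum_order_ready;     /* set once the hugetlb setup ran */
static int hotplug_saw_ready = -1;  /* what the hotplug enable observed */

static void hugetlb_init_sketch(void)
{
    minimum_order_ready = 1;
}

static void enable_hotplug_events_sketch(void)
{
    hotplug_saw_ready = minimum_order_ready;
}

/* Runs every call of a level before any call of the next level,
 * regardless of the order in which the calls were registered. */
void run_initcalls_sketch(const struct initcall_sketch *calls, size_t n)
{
    for (int level = 0; level < LEVEL_COUNT; level++)
        for (size_t i = 0; i < n; i++)
            if ((int)calls[i].level == level)
                calls[i].fn();
}
```

Registering the hotplug enable at the late level means it observes initialised state even when it is listed before the hugetlb setup.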
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      [mpe: Reorder the functions to make the diff readable]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  11. 12 Feb, 2018 1 commit
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds committed
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But the keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
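Whether the two families of constants agree on a given build target can be checked from userspace on Linux. The sketch below compares only five flags that both `<poll.h>` and `<sys/epoll.h>` define unconditionally. The function name is hypothetical, and userspace header values need not match what the kernel uses internally on every architecture.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/epoll.h>

struct flag_pair_sketch {
    unsigned int poll_value;
    unsigned int epoll_value;
};

/* Counts how many of the compared POLL and EPOLL flags differ. */
int count_poll_epoll_mismatches(void)
{
    const struct flag_pair_sketch pairs[] = {
        { POLLIN,  EPOLLIN  },
        { POLLOUT, EPOLLOUT },
        { POLLPRI, EPOLLPRI },
        { POLLERR, EPOLLERR },
        { POLLHUP, EPOLLHUP },
    };
    int mismatches = 0;

    for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
        if (pairs[i].poll_value != pairs[i].epoll_value)
            mismatches++;
    return mismatches;
}
```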
      Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 08 Feb, 2018 1 commit
    • N
      powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove · 1d9a0907
      Nathan Fontenot committed
      When DLPAR removing a CPU, the unmapping of the cpu from a node in
      unmap_cpu_from_node() should also invalidate the CPU's entry in the
      numa_cpu_lookup_table. There is not a guarantee that on a subsequent
      DLPAR add of the CPU the associativity will be the same and thus
      could be in a different node. Invalidating the entry in the
      numa_cpu_lookup_table causes the associativity to be read from the
      device tree at the time of the add.
      
      The current behavior of not invalidating the CPU's entry in the
      numa_cpu_lookup_table can result in scenarios where the topology
      layout of CPUs in the partition does not match the device tree
      or the topology reported by the HMC.
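The stale-entry problem can be reproduced in a small model. All names are hypothetical: `device_tree_nid_sketch` stands in for the associativity read from the device tree, and the `invalidate` flag exists only so the old and new behaviour can be compared side by side.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH     4
#define NID_INVALID_SKETCH (-1)

/* Cached CPU-to-node mapping; -1 means "not known, consult the device tree". */
static int lookup_table_sketch[NR_CPUS_SKETCH] = { -1, -1, -1, -1 };

/* Stand-in for the node the device tree currently reports per CPU. */
static int device_tree_nid_sketch[NR_CPUS_SKETCH];

/* On add: use the cached node if present, else read the device tree. */
int cpu_add_nid_sketch(int cpu)
{
    if (lookup_table_sketch[cpu] == NID_INVALID_SKETCH)
        lookup_table_sketch[cpu] = device_tree_nid_sketch[cpu];
    return lookup_table_sketch[cpu];
}

/* On remove: the fix corresponds to invalidate == true. */
void cpu_remove_sketch(int cpu, bool invalidate)
{
    if (invalidate)
        lookup_table_sketch[cpu] = NID_INVALID_SKETCH;
}
```

If the entry survives the remove, a later add returns the old node even though the device tree now reports a different one.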
      
      This bug looks like it was introduced in 2004 in the commit titled
      "ppc64: cpu hotplug notifier for numa", which is 6b15e4e87e32 in the
      linux-fullhist tree. Hence tag it for all stable releases.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  13. 28 Jan, 2018 1 commit
    • M
      powerpc/cell: Remove axonram driver · 1d65b1c8
      Michael Ellerman committed
      The QS21/22 IBM Cell blades had a southbridge chip called Axon. This
      could have DDR DIMMs attached to it, though they were not directly
      usable as RAM, instead they could be used as some sort of buffer, if
      applications were written specifically to use the block device
      provided by the driver.
      
      Although the driver supposedly had direct access support, it was
      apparently never tested (see commit 91117a20 ("axonram: Fix bug in
      direct_access")).
      
      These machines have not been available for over 5 years, and were
      never widely in use. It seems highly unlikely anyone is using this
      driver.
      
      In general we're happy to leave old drivers in the tree, but because
      DAX is involved this driver is caught up in the ongoing work in that
      area, but none of the DAX folks are able to test it.
      
      So remove the driver. If anyone *is* using it, we'll be happy to put
      it back.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  14. 27 Jan, 2018 8 commits
  15. 24 Jan, 2018 1 commit