1. 23 Mar, 2018 1 commit
  2. 09 Feb, 2018 1 commit
  3. 06 Feb, 2018 2 commits
    • M
      membarrier: Provide GLOBAL_EXPEDITED command · c5f58bd5
      Mathieu Desnoyers committed
      Allow expedited membarrier to be used for data shared between processes
      through shared memory.
      
      Processes wishing to receive the membarriers register with
      MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue
      membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED.
      
      This allows an extremely simple kernel-level implementation: we have
      almost everything we need with the PRIVATE_EXPEDITED barrier code. All
      we need to do is add a flag in the mm_struct that is used to check
      whether we need to send the IPI to the current thread of each CPU.
      
      There is a slight downside to this approach compared to targeting
      specific shared memory users: when performing a membarrier operation,
      all registered "global" receivers will get the barrier, even if they
      don't share a memory mapping with the sender issuing
      MEMBARRIER_CMD_GLOBAL_EXPEDITED.
      
      This registration approach seems to fit the requirement of not
      disturbing processes that really deeply care about real-time: they
      simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED.
      
      In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED"
      command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of
      MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward
      compatibility.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrea Parri <parri.andrea@gmail.com>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Avi Kivity <avi@scylladb.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: David Sehr <sehr@google.com>
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maged Michael <maged.michael@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-api@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180129202020.8515-5-mathieu.desnoyers@efficios.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
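As a rough illustration of the register/issue split described in this commit, the userspace sketch below shows how a receiver might opt in and how a sender might issue the barrier. The helper names (`membarrier_call`, `mask_supports`, `receiver_register`, `sender_barrier`) and the `CMD_*` enum are hypothetical; the numeric values mirror the UAPI header `<linux/membarrier.h>` and are repeated only so the sketch builds against older headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values mirror <linux/membarrier.h>; the CMD_* names are local to this sketch. */
enum {
    CMD_QUERY                     = 0,
    CMD_GLOBAL                    = 1 << 0, /* formerly MEMBARRIER_CMD_SHARED */
    CMD_GLOBAL_EXPEDITED          = 1 << 1,
    CMD_REGISTER_GLOBAL_EXPEDITED = 1 << 2,
};

/* Thin wrapper around the raw system call. */
static long membarrier_call(int cmd, int flags)
{
    return syscall(SYS_membarrier, cmd, flags);
}

/* Pure helper: does the bitmask returned by CMD_QUERY advertise `cmd`? */
static int mask_supports(long mask, int cmd)
{
    return mask >= 0 && (mask & cmd) == cmd;
}

/* Receiver side: opt in to barriers issued by other processes.
 * Returns 0 on success or a negative errno value. */
static int receiver_register(void)
{
    long mask = membarrier_call(CMD_QUERY, 0);

    if (!mask_supports(mask, CMD_REGISTER_GLOBAL_EXPEDITED))
        return -ENOSYS;
    return membarrier_call(CMD_REGISTER_GLOBAL_EXPEDITED, 0) == 0 ? 0 : -errno;
}

/* Sender side: issue a barrier on every running thread that belongs to a
 * registered process. Returns 0 on success or a negative errno value. */
static int sender_barrier(void)
{
    long mask = membarrier_call(CMD_QUERY, 0);

    if (!mask_supports(mask, CMD_GLOBAL_EXPEDITED))
        return -ENOSYS;
    return membarrier_call(CMD_GLOBAL_EXPEDITED, 0) == 0 ? 0 : -errno;
}
```

Querying first keeps the sketch usable on kernels that predate this command; processes that care about real-time latency would simply never call `receiver_register()`.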
    • M
      powerpc, membarrier: Skip memory barrier in switch_mm() · 3ccfebed
      Mathieu Desnoyers committed
      Allow PowerPC to skip the full memory barrier in switch_mm(), and
      only issue the barrier when scheduling into a task belonging to a
      process that has registered to use private expedited membarrier.
      
      Threads that target the same VM but belong to different thread
      groups are a tricky case, with a few consequences:
      
      It turns out that we cannot rely on get_nr_threads(p) to count the
      number of threads using a VM. We can use
      (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)
      instead to skip the synchronize_sched() for cases where the VM only has
      a single user, and that user only has a single thread.
      
      It also turns out that we cannot use for_each_thread() to set
      thread flags in all threads using a VM, as it only iterates on the
      thread group.
      
      Therefore, test the membarrier state variable directly rather than
      relying on thread flags. This means
      membarrier_register_private_expedited() needs to set the
      MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and
      only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows
      private expedited membarrier commands to succeed.
      membarrier_arch_switch_mm() now tests for the
      MEMBARRIER_STATE_PRIVATE_EXPEDITED flag.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrea Parri <parri.andrea@gmail.com>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Avi Kivity <avi@scylladb.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: David Sehr <sehr@google.com>
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maged Michael <maged.michael@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-api@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
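The registration ordering described in this commit can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel code: `struct mm_model`, the `STATE_*` names, and the `grace_periods` counter (which stands in for a call to synchronize_sched()) are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit values are arbitrary in this sketch. */
enum {
    STATE_PRIVATE_EXPEDITED_READY = 1 << 0,
    STATE_PRIVATE_EXPEDITED       = 1 << 1,
};

/* Hypothetical stand-in for the parts of mm_struct that matter here. */
struct mm_model {
    atomic_int state;   /* membarrier state bits */
    int mm_users;       /* number of users of this address space */
    int nr_threads;     /* threads in the caller's thread group */
    int grace_periods;  /* counts simulated synchronize_sched() calls */
};

static void register_private_expedited(struct mm_model *mm)
{
    if (atomic_load(&mm->state) & STATE_PRIVATE_EXPEDITED_READY)
        return; /* already registered */

    /* Step 1: make switch_mm() start issuing the barrier. */
    atomic_fetch_or(&mm->state, STATE_PRIVATE_EXPEDITED);

    /* Step 2: wait for concurrent context switches, unless the VM has a
     * single user and that user has a single thread. */
    if (!(mm->mm_users == 1 && mm->nr_threads == 1))
        mm->grace_periods++; /* stands in for synchronize_sched() */

    /* Step 3: only now allow private expedited commands to succeed. */
    atomic_fetch_or(&mm->state, STATE_PRIVATE_EXPEDITED_READY);
}

/* What the arch switch_mm() hook tests: the state variable, not thread flags. */
static bool switch_mm_needs_barrier(struct mm_model *next)
{
    return (atomic_load(&next->state) & STATE_PRIVATE_EXPEDITED) != 0;
}

/* Whether a private expedited membarrier command may proceed. */
static bool private_expedited_allowed(struct mm_model *mm)
{
    return (atomic_load(&mm->state) & STATE_PRIVATE_EXPEDITED_READY) != 0;
}
```

Keeping the state on the address space rather than on each thread sidesteps the problem that for_each_thread() only walks one thread group.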
  4. 01 Feb, 2018 3 commits
  5. 30 Jan, 2018 1 commit
    • M
      powerpc/mm/radix: Fix build error when RADIX_MMU=n · 015eb1b8
      Michael Ellerman committed
      The recent TLB flush rework broke the build when the Radix MMU is
      disabled at build time, e.g.:
      
        (.text+0x264): undefined reference to `.radix__tlbiel_all'
      
      We could add an empty version, but if we ever called it by accident
      that would indicate a bad bug, so add a stub that just WARNs if we do.
      
      Fixes: d4748276 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
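The "stub that WARNs" pattern can be sketched in plain C as below. This is a userspace model under stated assumptions: `HAVE_RADIX_MMU` is a hypothetical stand-in for the build-time option, `warn_model()` stands in for the kernel's warning facility, and `radix_tlbiel_all_model()` is not the kernel function itself.

```c
#include <stdio.h>

static int warn_count; /* number of warnings raised so far */

/* Stand-in for a kernel warning: report the problem and keep running. */
static void warn_model(const char *what)
{
    fprintf(stderr, "WARNING: %s called unexpectedly\n", what);
    warn_count++;
}

#ifdef HAVE_RADIX_MMU
/* Feature built in: the real implementation lives elsewhere. */
void radix_tlbiel_all_model(unsigned int action);
#else
/* Feature compiled out: the symbol still exists so callers link, but
 * reaching it indicates a bug, so it warns instead of silently doing nothing. */
static inline void radix_tlbiel_all_model(unsigned int action)
{
    (void)action;
    warn_model("radix_tlbiel_all_model");
}
#endif
```

Compared with an empty stub, the warning makes an accidental call visible at run time while still fixing the link error.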
  6. 27 Jan, 2018 5 commits
  7. 24 Jan, 2018 4 commits
    • F
      ocxl: Add AFU interrupt support · aeddad17
      Frederic Barrat committed
      Add user APIs through ioctl to allocate, free, and be notified of an
      AFU interrupt.
      
      For opencapi, an AFU can trigger an interrupt on the host by sending a
      specific command targeting a 64-bit object handle. On POWER9, this is
      implemented by mapping a special page in the address space of a
      process; a write to that page triggers an interrupt.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • F
      powerpc/powernv: Capture actag information for the device · 2cb3d64b
      Frederic Barrat committed
      In the opencapi protocol, host memory contexts are referenced by an
      'actag'. During setup, a driver must tell the device how many actags
      it can use, and what values are acceptable.
      
      On POWER9, the NPU can handle 64 actags per link, so they must be
      shared between all the PCI functions of the link. To get a global
      picture of how many actags are used by each AFU of every function, we
      capture some data at the end of PCI enumeration, so that actags can be
      shared fairly if needed.
      
      This is not powernv specific per se, but rather a consequence of the
      opencapi configuration specification being quite general. The number
      of available actags on POWER9 makes it more likely to be hit. This is
      somewhat mitigated by the fact that existing AFUs are coded to
      request a reasonable count of actags and existing devices carry
      only one AFU.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
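One plausible reading of "shared fairly" is proportional scaling: each function keeps what it asked for when the link-wide demand fits within the 64 available actags, and is scaled down in proportion otherwise. The sketch below assumes that policy; the commit message does not spell it out, and `ACTAG_MAX` and `actag_share()` are hypothetical names.

```c
#include <stdint.h>

#define ACTAG_MAX 64 /* actags per link on POWER9, per the commit message */

/* Number of actags granted to one PCI function.
 * desired: actags this function's AFUs ask for
 * total:   actags asked for by all functions on the link */
static uint16_t actag_share(uint16_t desired, uint16_t total)
{
    if (total <= ACTAG_MAX)
        return desired; /* everything fits, no scaling needed */

    /* Scale down proportionally; integer division rounds toward zero,
     * so the sum of all shares never exceeds ACTAG_MAX. */
    return (uint16_t)((uint32_t)ACTAG_MAX * desired / total);
}
```

For example, with two functions asking for 96 and 32 actags (128 in total), this policy would grant 48 and 16.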
    • F
      powerpc/powernv: Add platform-specific services for opencapi · 6914c757
      Frederic Barrat committed
      Implement a few platform-specific calls which can be used by drivers:
      
      - provide the Transaction Layer capabilities of the host, so that the
        driver can find some common ground and configure the device and host
        appropriately.
      
      - provide the hw interrupt to be used for translation faults raised by
        the NPU
      
      - map/unmap some NPU mmio registers to get the fault context when the
        NPU raises an address translation fault
      
      The rest are wrappers around the previously-introduced opal calls.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • F
      powerpc/powernv: Add opal calls for opencapi · 74d656d2
      Frederic Barrat committed
      Add opal calls to interact with the NPU:
      
      OPAL_NPU_SPA_SETUP: set the Shared Process Area (SPA)
      The SPA is a table containing one entry (Process Element) per memory
      context which can be accessed by the opencapi device.
      
      OPAL_NPU_SPA_CLEAR_CACHE: clear the context cache
      The NPU keeps a cache of recently accessed memory contexts. When a
      Process Element is removed from the SPA, the cache for the link must
      be cleared.
      
      OPAL_NPU_TL_SET: configure the Transaction Layer
      The Transaction Layer specification defines several templates for
      messages to be exchanged on the link. During link setup, the host and
      device must negotiate what templates are supported on both sides and
      at what rates those messages can be sent.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  8. 23 Jan, 2018 2 commits
  9. 22 Jan, 2018 1 commit
    • N
      powerpc/pseries, ps3: panic flush kernel messages before halting system · 35adacd6
      Nicholas Piggin committed
      Platforms with a panic handler that halts the system can have problems
      getting kernel messages out, because the panic notifiers are called
      before kernel/panic.c flushes the printk buffers, console, etc.
      
      Commit a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic
      notifier") attempted to solve this, but that wasn't the right approach
      and caused other problems, so it was reverted by commit ab9dbf77.
      
      The powernv shutdown paths already had a similar problem, which was
      fixed by taking the message flushing sequence from kernel/panic.c.
      That's a little bit ugly, but while we have the code duplicated, it
      will work for this case as well. So have ppc panic handlers do the
      same flushing before they terminate.
      
      Without this patch, a qemu pseries_le_defconfig guest stops silently
      when issued the nmi command while xmon is off and no crash dumpers
      are enabled. With it, an oops is printed by each CPU as expected.
      
      Fixes: ab9dbf77 ("Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 21 Jan, 2018 6 commits
  11. 20 Jan, 2018 14 commits