1. 29 12月, 2014 1 次提交
    • I
      cxl: Disable AFU debug flag · d6a6af2c
      Ian Munsie 提交于
      Upon inspection of the implementation specific registers, it was
      discovered that the high bit of the implementation specific RXCTL
      register was enabled, which enables the DEADB00F debug feature.
      
      The debug feature causes MMIO reads to a disabled AFU to respond with
      0xDEADB00F instead of all Fs. In general this should not be visible as
      the kernel will only allow MMIO access to enabled AFUs, but there may be
      some circumstances where an AFU may become disabled while it is use.
      One such case would be an AFU designed to only be used in the dedicated
      process mode and to disable itself after it has completed it's work
      (however even in that case the effects of this debug flag would be
      limited as the userspace application must have completed any required
      MMIO accesses before the AFU disables itself with or without the flag).
      
      This patch removes the debug flag and replaces the magic value
      programmed into this register with a preprocessor define so it is
      clearer what the rest of this initialisation does.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d6a6af2c
  2. 12 12月, 2014 2 次提交
    • I
      cxl: Unmap MMIO regions when detaching a context · b123429e
      Ian Munsie 提交于
      If we need to force detach a context (e.g. due to EEH or simply force
      unbinding the driver) we should prevent the userspace contexts from
      being able to access the Problem State Area MMIO region further, which
      they may have mapped with mmap().
      
      This patch unmaps any mapped MMIO regions when detaching a userspace
      context.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b123429e
    • I
      cxl: Change contexts_lock to a mutex to fix sleep while atomic bug · ee41d11d
      Ian Munsie 提交于
      We had a known sleep while atomic bug if a CXL device was forcefully
      unbound while it was in use. This could occur as a result of EEH, or
      manually induced with something like this while the device was in use:
      
      echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind
      
      The issue was that in this code path we iterated over each context and
      forcefully detached it with the contexts_lock spin lock held, however
      the detach also needed to take the spu_mutex, and call schedule.
      
      This patch changes the contexts_lock to a mutex so that we are not in
      atomic context while doing the detach, thereby avoiding the sleep while
      atomic.
      
      Also delete the related TODO comment, which suggested an alternate
      solution which turned out to not be workable.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ee41d11d
  3. 27 11月, 2014 2 次提交
    • M
      cxl: Name interrupts in /proc/interrupt · ca4f8280
      Michael Neuling 提交于
      Currently all interrupts generated by cxl are named "cxl".  This is not very
      informative as we can't distinguish between cards, AFUs, error interrupts, user
      contexts and user interrupts numbers.  Being able to distinguish them is useful
      for setting affinity.
      
      This patch gives each of these names in /proc/interrupts.
      
      A two card CAPI system, with afu0.0 having 2 active contexts each with 4 user
      IRQs each, will now look like this:
      
          % grep cxl /proc/interrupts
          444:          0  OPAL ICS 141312 Level     cxl-card1-err
          445:          0  OPAL ICS 141313 Level     cxl-afu1.0-err
          446:          0  OPAL ICS 141314 Level     cxl-afu1.0
          462:          0  OPAL ICS 2052 Level     cxl-afu0.0-pe0-1
          463:      75517  OPAL ICS 2053 Level     cxl-afu0.0-pe0-2
          468:          0  OPAL ICS 2054 Level     cxl-afu0.0-pe0-3
          469:          0  OPAL ICS 2055 Level     cxl-afu0.0-pe0-4
          470:          0  OPAL ICS 2056 Level     cxl-afu0.0-pe1-1
          471:      75506  OPAL ICS 2057 Level     cxl-afu0.0-pe1-2
          472:          0  OPAL ICS 2058 Level     cxl-afu0.0-pe1-3
          473:          0  OPAL ICS 2059 Level     cxl-afu0.0-pe1-4
          502:       1066  OPAL ICS 2050 Level     cxl-afu0.0
          514:          0  OPAL ICS 2048 Level     cxl-card0-err
          515:          0  OPAL ICS 2049 Level     cxl-afu0.0-err
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca4f8280
    • I
      CXL: Return error to PSL if IRQ demultiplexing fails & print clearer warning · 27bbcef2
      Ian Munsie 提交于
      If an AFU has a hardware bug that causes it to acknowledge a context
      terminate or remove while that context has outstanding transactions, it
      is possible for the kernel to receive an interrupt for that context
      after we have removed it from the context list.
      
      The kernel will not be able to demultiplex the interrupt (or worse - if
      we have already reallocated the process handle we could mis-attribute it
      to the new context), and printed a big scary warning.
      
      It did not acknowledge the interrupt, which would effectively halt
      further translation fault processing on the PSL.
      
      This patch makes the warning clearer about the likely cause of the issue
      (i.e. hardware bug) to make it obvious to future AFU designers of what
      needs to be fixed. It also prints out the process handle which can then
      be matched up with hardware and software traces for debugging.
      
      It also acknowledges the interrupt to the PSL with either an address
      error or acknowledge, so that the PSL can continue with other
      translations.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      27bbcef2
  4. 18 11月, 2014 2 次提交
    • M
      cxl: Name interrupts in /proc/interrupt · 80fa93fc
      Michael Neuling 提交于
      Currently all interrupts generated by cxl are named "cxl".  This is not very
      informative as we can't distinguish between cards, AFUs, error interrupts, user
      contexts and user interrupts numbers.  Being able to distinguish them is useful
      for setting affinity.
      
      This patch gives each of these names in /proc/interrupts.
      
      A two card CAPI system, with afu0.0 having 2 active contexts each with 4 user
      IRQs each, will now look like this:
      
          % grep cxl /proc/interrupts
          444:          0  OPAL ICS 141312 Level     cxl-card1-err
          445:          0  OPAL ICS 141313 Level     cxl-afu1.0-err
          446:          0  OPAL ICS 141314 Level     cxl-afu1.0
          462:          0  OPAL ICS 2052 Level     cxl-afu0.0-pe0-1
          463:      75517  OPAL ICS 2053 Level     cxl-afu0.0-pe0-2
          468:          0  OPAL ICS 2054 Level     cxl-afu0.0-pe0-3
          469:          0  OPAL ICS 2055 Level     cxl-afu0.0-pe0-4
          470:          0  OPAL ICS 2056 Level     cxl-afu0.0-pe1-1
          471:      75506  OPAL ICS 2057 Level     cxl-afu0.0-pe1-2
          472:          0  OPAL ICS 2058 Level     cxl-afu0.0-pe1-3
          473:          0  OPAL ICS 2059 Level     cxl-afu0.0-pe1-4
          502:       1066  OPAL ICS 2050 Level     cxl-afu0.0
          514:          0  OPAL ICS 2048 Level     cxl-card0-err
          515:          0  OPAL ICS 2049 Level     cxl-afu0.0-err
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      80fa93fc
    • I
      cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning · bc78b05b
      Ian Munsie 提交于
      If an AFU has a hardware bug that causes it to acknowledge a context
      terminate or remove while that context has outstanding transactions, it
      is possible for the kernel to receive an interrupt for that context
      after we have removed it from the context list.
      
      The kernel will not be able to demultiplex the interrupt (or worse - if
      we have already reallocated the process handle we could mis-attribute it
      to the new context), and printed a big scary warning.
      
      It did not acknowledge the interrupt, which would effectively halt
      further translation fault processing on the PSL.
      
      This patch makes the warning clearer about the likely cause of the issue
      (i.e. hardware bug) to make it obvious to future AFU designers of what
      needs to be fixed. It also prints out the process handle which can then
      be matched up with hardware and software traces for debugging.
      
      It also acknowledges the interrupt to the PSL with either an address
      error or acknowledge, so that the PSL can continue with other
      translations.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bc78b05b
  5. 08 10月, 2014 1 次提交
    • I
      cxl: Driver code for powernv PCIe based cards for userspace access · f204e0b8
      Ian Munsie 提交于
      This is the core of the cxl driver.
      
      It adds support for using cxl cards in the powernv environment only (ie POWER8
      bare metal). It allows access to cxl accelerators by userspace using the
      /dev/cxl/afuM.N char devices.
      
      The kernel driver has no knowledge of the function implemented by the
      accelerator. It provides services to userspace via the /dev/cxl/afuM.N
      devices. When a program opens this device and runs the start work IOCTL, the
      accelerator will have coherent access to that processes memory using the same
      virtual addresses. That process may mmap the device to access any MMIO space
      the accelerator provides.  Also, reads on the device will allow interrupts to
      be received. These services are further documented in a later patch in
      Documentation/powerpc/cxl.txt.
      
      Documentation of the cxl hardware architecture and userspace API is provided in
      subsequent patches.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f204e0b8