1. 24 10月, 2016 1 次提交
  2. 19 10月, 2016 1 次提交
    • V
      cxl: Prevent adapter reset if an active context exists · 70b565bb
      Vaibhav Jain 提交于
      This patch prevents resetting the cxl adapter via sysfs in presence of
      one or more active cxl_context on it. This protects against an
      unrecoverable error caused by PSL owning a dirty cache line even after
      reset and host tries to touch the same cache line. In case a force reset
      of the card is required irrespective of any active contexts, the int
      value -1 can be stored in the 'reset' sysfs attribute of the card.
      
      The patch introduces a new atomic_t member named contexts_num inside
      struct cxl that holds the number of active context attached to the card
      , which is checked against '0' before proceeding with the reset. To
      prevent against a race condition where a context is activated just after
      reset check is performed, the contexts_num is atomically set to '-1'
      after reset-check to indicate that no more contexts can be activated on
      the card anymore.
      
      Before activating a context we atomically test if contexts_num is
      non-negative and if so, increment its value by one. In case the value of
      contexts_num is negative then it indicates that the card is about to be
      reset and context activation is error-ed out at that point.
      
      Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
      Cc: stable@vger.kernel.org # v4.0+
      Acked-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      70b565bb
  3. 14 7月, 2016 4 次提交
    • I
      cxl: Add support for interrupts on the Mellanox CX4 · a2f67d5e
      Ian Munsie 提交于
      The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
      interrupts are routed from the networking hardware to the XSL using the
      MSIX table, and from there will be transformed back into an MSIX
      interrupt using the cxl style interrupts (i.e. using IVTE entries and
      ranges to map a PE and AFU interrupt number to an MSIX address).
      
      We want to hide the implementation details of cxl interrupts as much as
      possible. To this end, we use a special version of the MSI setup &
      teardown routines in the PHB while in cxl mode to allocate the cxl
      interrupts and configure the IVTE entries in the process element.
      
      This function does not configure the MSIX table - the CX4 card uses a
      custom format in that table and it would not be appropriate to fill that
      out in generic code. The rest of the functionality is similar to the
      "Full MSI-X mode" described in the CAIA, and this could be easily
      extended to support other adapters that use that mode in the future.
      
      The interrupts will be associated with the default context. If the
      maximum number of interrupts per context has been limited (e.g. by the
      mlx5 driver), it will automatically allocate additional kernel contexts
      to associate extra interrupts as required. These contexts will be
      started using the same WED that was used to start the default context.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a2f67d5e
    • I
      cxl: Add preliminary workaround for CX4 interrupt limitation · cbce0917
      Ian Munsie 提交于
      The Mellanox CX4 has a hardware limitation where only 4 bits of the
      AFU interrupt number can be passed to the XSL when sending an interrupt,
      limiting it to only 15 interrupts per context (AFU interrupt number 0 is
      invalid).
      
      In order to overcome this, we will allocate additional contexts linked
      to the default context as extra address space for the extra interrupts -
      this will be implemented in the next patch.
      
      This patch adds the preliminary support to allow this, by way of adding
      a linked list in the context structure that we use to keep track of the
      contexts dedicated to interrupts, and an API to simultaneously iterate
      over the related context structures, AFU interrupt numbers and hardware
      interrupt numbers. The point of using a single API to iterate these is
      to hide some of the details of the iteration from external code, and to
      reduce the number of APIs that need to be exported via base.c to allow
      built in code to call.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      cbce0917
    • I
      cxl: Add kernel APIs to get & set the max irqs per context · 79384e4b
      Ian Munsie 提交于
      These APIs will be used by the Mellanox CX4 support. While they function
      standalone to configure existing behaviour, their primary purpose is to
      allow the Mellanox driver to inform the cxl driver of a hardware
      limitation, which will be used in a future patch.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      79384e4b
    • I
      cxl: Add support for using the kernel API with a real PHB · 317f5ef1
      Ian Munsie 提交于
      This hooks up support for using the kernel API with a real PHB. After
      the AFU initialisation has completed it calls into the PHB code to pass
      it the AFU that will be used by other peer physical functions on the
      adapter.
      
      The cxl_pci_to_afu API is extended to work with peer PCI devices,
      retrieving the peer AFU from the PHB. This API may also now return an
      error if it is called on a PCI device that is not associated with either
      a cxl vPHB or a peer PCI device to an AFU, and this error is propagated
      down.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      317f5ef1
  4. 28 6月, 2016 2 次提交
    • M
      cxl: Add set and get private data to context struct · ad42de85
      Michael Neuling 提交于
      This provides AFU drivers a means to associate private data with a cxl
      context. This is particularly intended for make the new callbacks for
      driver specific events easier for AFU drivers to use, as they can easily
      get back to any private data structures they may use.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com
      Reviewed-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ad42de85
    • P
      cxl: Add mechanism for delivering AFU driver specific events · b810253b
      Philippe Bergheaud 提交于
      This adds an afu_driver_ops structure with fetch_event() and
      event_delivered() callbacks. An AFU driver such as cxlflash can fill
      this out and associate it with a context to enable passing custom AFU
      specific events to userspace.
      
      This also adds a new kernel API function cxl_context_pending_events(),
      that the AFU driver can use to notify the cxl driver that new specific
      events are ready to be delivered, and wake up anyone waiting on the
      context wait queue.
      
      The current count of AFU driver specific events is stored in the field
      afu_driver_events of the context structure.
      
      The cxl driver checks the afu_driver_events count during poll, select,
      read, etc. calls to check if an AFU driver specific event is pending,
      and calls fetch_event() to obtain and deliver that event. This way, the
      cxl driver takes care of all the usual locking semantics around these
      calls and handles all the generic cxl events, so that the AFU driver
      only needs to worry about it's own events.
      
      fetch_event() return a struct cxl_event_afu_driver_reserved, allocated
      by the AFU driver, and filled in with the specific event information and
      size. Total event size (header + data) should not be greater than
      CXL_READ_MIN_SIZE (4K).
      
      Th cxl driver prepends an appropriate cxl event header, copies the event
      to userspace, and finally calls event_delivered() to return the status of
      the operation to the AFU driver. The event is identified by the context
      and cxl_event_afu_driver_reserved pointers.
      
      Since AFU drivers provide their own means for userspace to obtain the
      AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file
      descriptor to obtain the AFU file descriptor) and the generic cxl driver
      will never use this event, the ABI of the event is up to each individual
      AFU driver.
      Signed-off-by: NPhilippe Bergheaud <felix@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b810253b
  5. 16 6月, 2016 1 次提交
    • I
      cxl: Update process element after allocating interrupts · 292841b0
      Ian Munsie 提交于
      In the kernel API, it is possible to attempt to allocate AFU interrupts
      after already starting a context. Since the process element structure
      used by the hardware is only filled out at the time the context is
      started, it will not be updated with the interrupt numbers that have
      just been allocated and therefore AFU interrupts will not work unless
      they were allocated prior to starting the context.
      
      This can present some difficulties as each CAPI enabled PCI device in
      the kernel API has a default context, which may need to be started very
      early to enable translations, potentially before interrupts can easily
      be set up.
      
      This patch makes the API more flexible to allow interrupts to be
      allocated after a context has already been started and takes care of
      updating the PE structure used by the hardware and notifying it to
      discard any cached copy it may have.
      
      The update is currently performed via a terminate/remove/add sequence.
      This is necessary on some hardware such as the XSL that does not
      properly support the update LLCMD.
      
      Note that this is only supported on powernv at present - attempting to
      perform this ordering on PowerVM will raise a warning.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      292841b0
  6. 11 5月, 2016 1 次提交
    • I
      cxl: Add kernel API to allow a context to operate with relocate disabled · 7a0d85d3
      Ian Munsie 提交于
      cxl devices typically access memory using an MMU in much the same way as
      the CPU, and each context includes a state register much like the MSR in
      the CPU. Like the CPU, the state register includes a bit to enable
      relocation, which we currently always enable.
      
      In some cases, it may be desirable to allow a device to access memory
      using real addresses instead of effective addresses, so this adds a new
      API, cxl_set_translation_mode, that can be used to disable relocation
      on a given kernel context. This can allow for the creation of a special
      privileged context that the device can use if it needs relocation
      disabled, and can use regular contexts at times when it needs relocation
      enabled.
      
      This interface is only available to users of the kernel API for obvious
      reasons, and will never be supported in a virtualised environment.
      
      This will be used by the upcoming cxl support in the mlx5 driver.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7a0d85d3
  7. 11 4月, 2016 1 次提交
  8. 09 3月, 2016 5 次提交
  9. 05 1月, 2016 1 次提交
  10. 24 11月, 2015 1 次提交
    • V
      cxl: Fix possible idr warning when contexts are released · 1b5df59e
      Vaibhav Jain 提交于
      An idr warning is reported when a context is release after the capi card
      is unbound from the cxl driver via sysfs. Below are the steps to
      reproduce:
      
      1. Create multiple afu contexts in an user-space application using libcxl.
      2. Unbind capi card from cxl using command of form
         echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
      3. Exit/kill the application owning afu contexts.
      
      After above steps a warning message is usually seen in the kernel logs
      of the form "idr_remove called for id=<context-id> which is not
      allocated."
      
      This is caused by the function cxl_release_afu which destroys the
      contexts_idr table. So when a context is release no entry for context pe
      is found in the contexts_idr table and idr code prints this warning.
      
      This patch fixes this issue by increasing & decreasing the ref-count on
      the afu device when a context is initialized or when its freed
      respectively. This prevents the afu from being released until all the
      afu contexts have been released. The patch introduces two new functions
      namely cxl_afu_get/put that manage the ref-count on the afu device.
      
      Also the patch removes code inside cxl_dev_context_init that increases ref
      on the afu device as its guaranteed to be alive during this function.
      Reported-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1b5df59e
  11. 01 10月, 2015 1 次提交
  12. 30 8月, 2015 2 次提交
    • I
      cxl: Fix force unmapping mmaps of contexts allocated through the kernel api · 55e07668
      Ian Munsie 提交于
      The cxl user api uses the address_space associated with the file when we
      need to force unmap all cxl mmap regions (e.g. on eeh, driver detach,
      etc). Currently, contexts allocated through the kernel api do not do
      this and instead skip the mmap invalidation, potentially allowing them
      to poke at the hardware after such an event, which may cause all sorts
      of trouble.
      
      This patch allocates an address_space for cxl contexts allocated through
      the kernel api so that the same invalidate path will for these contexts
      as well. We don't use the anonymous inode's address_space, as doing so
      could invalidate any mmaps of completely unrelated drivers using
      anonymous file descriptors.
      
      This patch also introduces a kernelapi flag, so we know when freeing the
      context if the address_space was allocated by us and needs to be freed.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      55e07668
    • I
      cxl: Fix + cleanup error paths in cxl_dev_context_init · af2a50bb
      Ian Munsie 提交于
      If the cxl_context_alloc() call fails, we return immediately without
      releasing the reference on the AFU device, allowing it to leak.
      
      This patch switches to using goto style error handling so that the
      device is released in common code for both error paths, and will also
      simplify things if we add additional initialisation in this function in
      the future.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      af2a50bb
  13. 20 8月, 2015 1 次提交
    • A
      cxl: Allow release of contexts which have been OPENED but not STARTED · 7c26b9cf
      Andrew Donnellan 提交于
      If we open a context but do not start it (either because we do not attempt
      to start it, or because it fails to start for some reason), we are left
      with a context in state OPENED. Previously, cxl_release_context() only
      allowed releasing contexts in state CLOSED, so attempting to release an
      OPENED context would fail.
      
      In particular, this bug causes available contexts to run out after some EEH
      failures, where drivers attempt to release contexts that have failed to
      start.
      
      Allow releasing contexts in any state with a value lower than STARTED, i.e.
      OPENED or CLOSED (we can't release a STARTED context as it's currently
      using the hardware, and we assume that contexts in any new states which may
      be added in future with a value higher than STARTED are also unsafe to
      release).
      
      Cc: stable@vger.kernel.org
      Fixes: 6f7f0b3d ("cxl: Add AFU virtual PHB and kernel API")
      Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7c26b9cf
  14. 14 8月, 2015 1 次提交
  15. 07 7月, 2015 1 次提交
  16. 03 6月, 2015 1 次提交
    • M
      cxl: Add AFU virtual PHB and kernel API · 6f7f0b3d
      Michael Neuling 提交于
      This patch does two things.
      
      Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER
      Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB).  This
      in in addition to the PSL being a PCI device itself.
      
      As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can
      provide an AFU configuration.  This AFU configuration recored is architected to
      be the same as a PCI config space.
      
      This patch sets discovers the AFU configuration records, provides AFU config
      space read/write functions to these configuration records.  It then enumerates
      the PCI bus.  It also hooks in PCI ops where appropriate.  It also destroys the
      vPHB when the physical card is removed.
      
      Secondly, it add an in kernel API for AFU to use CXL.  AFUs must present a
      driver that firstly binds as a PCI device.  This PCI device can then be using
      to do CXL specific operations (that can't sit in the PCI ops) using this API.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6f7f0b3d