1. 23 Jun, 2017 1 commit
  2. 02 May, 2017 1 commit
  3. 13 Apr, 2017 6 commits
  4. 21 Feb, 2017 1 commit
    • cxl: fix nested locking hang during EEH hotplug · 171ed0fc
      Andrew Donnellan committed
      Commit 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU
      not configured") introduced a rwsem to fix an invalid memory access that
      occurred when someone attempts to access the config space of an AFU on a
      vPHB whilst the AFU is deconfigured, such as during EEH recovery.
      
      It turns out that it's possible to run into a nested locking issue when EEH
      recovery fails and a full device hotplug is required.
      cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
      configured_rwsem. When EEH recovery fails, the EEH code calls
      pci_hp_remove_devices() to remove the device, which in turn calls
      cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
      to grab the writer lock that's already held.
      
      Standard rwsem semantics don't express what we really want to do here and
      don't allow for nested locking. Fix this by replacing the rwsem with an
      atomic_t which we can control more finely. Allow the AFU to be locked
      multiple times so long as there are no readers.
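
The counter-based scheme described above can be sketched in user space with C11 atomics. This is only an illustration under assumptions, not the driver's code: the type `afu_cfg_lock`, the function names, and the encoding (-1 for locked, a non-negative value for the reader count) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state word: -1 = locked (deconfigured), >= 0 = reader count. */
typedef struct {
    atomic_int state;
} afu_cfg_lock;

void afu_cfg_init(afu_cfg_lock *l)
{
    atomic_init(&l->state, 0);
}

/* Reader entry: succeeds only while the state is non-negative. */
bool afu_cfg_reader_get(afu_cfg_lock *l)
{
    int cur = atomic_load(&l->state);

    while (cur >= 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&l->state, &cur, cur + 1))
            return true;
    }
    return false;
}

void afu_cfg_reader_put(afu_cfg_lock *l)
{
    atomic_fetch_sub(&l->state, 1);
}

/*
 * Try to lock. Returns true if the lock is held after the call, which
 * includes the nested case where it was already locked. Returns false
 * while readers are present, so the caller can retry.
 */
bool afu_cfg_try_lock(afu_cfg_lock *l)
{
    int expected = 0;

    if (atomic_compare_exchange_strong(&l->state, &expected, -1))
        return true;
    return expected == -1;
}

/* Unlock (reconfigure): readers are allowed again. */
void afu_cfg_unlock(afu_cfg_lock *l)
{
    atomic_store(&l->state, 0);
}
```

In this sketch a second `afu_cfg_try_lock()` on an already locked AFU simply succeeds, which is the nested case an rwsem cannot express. The sketch keeps no nesting depth, so a single unlock releases the lock.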
      
      Fixes: 14a3ae34 ("cxl: Prevent read/write to AFU config space while AFU not configured")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. 03 Feb, 2017 1 commit
  6. 25 Jan, 2017 1 commit
  7. 18 Nov, 2016 1 commit
    • cxl: Fix coredump generation when cxl_get_fd() is used · bdecf76e
      Frederic Barrat committed
      If a process dumps core while owning a cxl file descriptor obtained
      from an AFU driver (e.g. cxlflash) through the cxl_get_fd() API, the
      following error occurs:
      
        [  868.027591] Unable to handle kernel paging request for data at address ...
        [  868.027778] Faulting instruction address: 0xc00000000035edb0
        cpu 0x8c: Vector: 300 (Data Access) at [c000003c688275e0]
            pc: c00000000035edb0: elf_core_dump+0xd60/0x1300
            lr: c00000000035ed80: elf_core_dump+0xd30/0x1300
            sp: c000003c68827860
           msr: 9000000100009033
           dar: c
        dsisr: 40000000
         current = 0xc000003c68780000
         paca    = 0xc000000001b73200   softe: 0        irq_happened: 0x01
            pid   = 46725, comm = hxesurelock
        enter ? for help
        [c000003c68827a60] c00000000036948c do_coredump+0xcec/0x11e0
        [c000003c68827c20] c0000000000ce9e0 get_signal+0x540/0x7b0
        [c000003c68827d10] c000000000017354 do_signal+0x54/0x2b0
        [c000003c68827e00] c00000000001777c do_notify_resume+0xbc/0xd0
        [c000003c68827e30] c000000000009838 ret_from_except_lite+0x64/0x68
        --- Exception: 300 (Data Access) at 00003fff98ad2918
      
      The root cause is that the address_space structure for the file
      doesn't define a 'host' member.
      
      When cxl allocates a file descriptor, it's using the anonymous inode
      to back the file, but allocates a private address_space for each
      context. The private address_space makes it possible to track memory allocation
      for each context. cxl doesn't define the 'host' member of the address
      space, i.e. the inode. We don't want to define it as the anonymous
      inode, since there's no longer a 1-to-1 relation between address_space
      and inode.
      
      To fix it, instead of using the anonymous inode, we introduce a simple
      pseudo filesystem so that cxl can allocate its own inodes. So we now
      have one inode for each file and address_space. The pseudo filesystem
      is only mounted on the first allocation of a file descriptor by
      cxl_get_fd().
      
      Tested with cxlflash.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  8. 19 Oct, 2016 1 commit
    • cxl: Prevent adapter reset if an active context exists · 70b565bb
      Vaibhav Jain committed
      This patch prevents resetting the cxl adapter via sysfs in the presence
      of one or more active cxl_context on it. This protects against an
      unrecoverable error caused by the PSL owning a dirty cache line even
      after reset while the host tries to touch the same cache line. In case a
      force reset of the card is required irrespective of any active contexts,
      the int value -1 can be stored in the 'reset' sysfs attribute of the card.
      
      The patch introduces a new atomic_t member named contexts_num inside
      struct cxl that holds the number of active contexts attached to the
      card, which is checked against '0' before proceeding with the reset. To
      guard against a race condition where a context is activated just after
      the reset check is performed, contexts_num is atomically set to '-1'
      after the reset check to indicate that no more contexts can be activated
      on the card.
      
      Before activating a context we atomically test if contexts_num is
      non-negative and if so, increment its value by one. If the value of
      contexts_num is negative, the card is about to be reset and context
      activation fails with an error at that point.
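
A minimal user-space model of the contexts_num protocol, written with C11 atomics, might look like the following. The struct `adapter_model`, the function names, the `force` parameter, and resetting the counter to 0 once the reset completes are assumptions made for illustration and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; only the counter is modelled. */
typedef struct {
    atomic_int contexts_num; /* >= 0: active contexts, -1: reset pending */
} adapter_model;

/* Context activation: increment only while the counter is non-negative. */
bool adapter_context_get(adapter_model *a)
{
    int cur = atomic_load(&a->contexts_num);

    while (cur >= 0) {
        if (atomic_compare_exchange_weak(&a->contexts_num, &cur, cur + 1))
            return true;
    }
    return false; /* a reset is pending, activation must fail */
}

void adapter_context_put(adapter_model *a)
{
    atomic_fetch_sub(&a->contexts_num, 1);
}

/*
 * Reset check: move the counter from 0 to -1 in one atomic step, so no
 * context can slip in between the check and the reset. A forced reset
 * (assumed here to correspond to writing -1 to the sysfs attribute)
 * skips the check.
 */
bool adapter_reset_begin(adapter_model *a, bool force)
{
    int expected = 0;

    if (force) {
        atomic_store(&a->contexts_num, -1);
        return true;
    }
    return atomic_compare_exchange_strong(&a->contexts_num, &expected, -1);
}

/* Assumption: after the reset, activation is allowed again. */
void adapter_reset_end(adapter_model *a)
{
    atomic_store(&a->contexts_num, 0);
}
```

The compare-and-exchange in `adapter_reset_begin()` is what closes the race: the check against zero and the switch to -1 happen as one operation.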
      
      Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
      Cc: stable@vger.kernel.org # v4.0+
      Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  9. 04 Oct, 2016 1 commit
  10. 09 Aug, 2016 1 commit
  11. 14 Jul, 2016 5 commits
    • cxl: Workaround PE=0 hardware limitation in Mellanox CX4 · f67a6722
      Ian Munsie committed
      The CX4 card cannot cope with a context with PE=0 due to a hardware
      limitation, resulting in:
      
      [   34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939
      [   34.166580] mlx5_core 0000:01:00.1: Failed allocating uar, aborting
      
      Since the kernel API allocates a default context very early during
      device init, and that context will almost certainly get Process Element
      ID 0, there is no easy way for us to extend the API to allow the Mellanox
      driver to inform us of this limitation ahead of time.
      
      Instead, work around the issue by extending the XSL structure to include
      a minimum PE to allocate. Although the bug is not in the XSL, it is the
      easiest place to work around this limitation given that the CX4 is
      currently the only card that uses an XSL.
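
The idea of a per-service-layer minimum PE can be illustrated with a small allocator that never hands out an ID below a configured floor. This is a sketch under assumptions: `pe_allocator`, `MAX_PE`, and the linear scan are hypothetical and do not reflect how the driver actually allocates process elements.

```c
#include <stdbool.h>

#define MAX_PE 64 /* hypothetical table size for the sketch */

typedef struct {
    bool used[MAX_PE];
    int min_pe; /* lowest PE the service layer can cope with */
} pe_allocator;

/* Return the lowest free PE >= min_pe, or -1 if none is left. */
int pe_alloc(pe_allocator *a)
{
    int start = a->min_pe < 0 ? 0 : a->min_pe;

    for (int pe = start; pe < MAX_PE; pe++) {
        if (!a->used[pe]) {
            a->used[pe] = true;
            return pe;
        }
    }
    return -1;
}

void pe_free(pe_allocator *a, int pe)
{
    if (pe >= 0 && pe < MAX_PE)
        a->used[pe] = false;
}
```

With `min_pe = 0` the first allocation returns 0, as it would for the default context on a PSL card; with `min_pe = 1` the first allocation returns 1, so PE 0 is never used.
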
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Add support for interrupts on the Mellanox CX4 · a2f67d5e
      Ian Munsie committed
      The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
      interrupts are routed from the networking hardware to the XSL using the
      MSIX table, and from there will be transformed back into an MSIX
      interrupt using the cxl style interrupts (i.e. using IVTE entries and
      ranges to map a PE and AFU interrupt number to an MSIX address).
      
      We want to hide the implementation details of cxl interrupts as much as
      possible. To this end, we use a special version of the MSI setup &
      teardown routines in the PHB while in cxl mode to allocate the cxl
      interrupts and configure the IVTE entries in the process element.
      
      This function does not configure the MSIX table - the CX4 card uses a
      custom format in that table and it would not be appropriate to fill that
      out in generic code. The rest of the functionality is similar to the
      "Full MSI-X mode" described in the CAIA, and this could be easily
      extended to support other adapters that use that mode in the future.
      
      The interrupts will be associated with the default context. If the
      maximum number of interrupts per context has been limited (e.g. by the
      mlx5 driver), it will automatically allocate additional kernel contexts
      to associate extra interrupts as required. These contexts will be
      started using the same WED that was used to start the default context.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Add preliminary workaround for CX4 interrupt limitation · cbce0917
      Ian Munsie committed
      The Mellanox CX4 has a hardware limitation where only 4 bits of the
      AFU interrupt number can be passed to the XSL when sending an interrupt,
      limiting it to only 15 interrupts per context (AFU interrupt number 0 is
      invalid).
      
      In order to overcome this, we will allocate additional contexts linked
      to the default context as extra address space for the extra interrupts -
      this will be implemented in the next patch.
      
      This patch adds the preliminary support to allow this, by way of adding
      a linked list in the context structure that we use to keep track of the
      contexts dedicated to interrupts, and an API to simultaneously iterate
      over the related context structures, AFU interrupt numbers and hardware
      interrupt numbers. The point of using a single API to iterate these is
      to hide some of the details of the iteration from external code, and to
      reduce the number of APIs that need to be exported via base.c to allow
      built in code to call.
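
The arithmetic behind the limitation can be made concrete. With 4 bits and AFU interrupt number 0 invalid, each context carries at most 15 interrupts, so a flat list of interrupts has to be spread over several contexts. The helper names and the `irq_slot` type below are hypothetical; the sketch only shows the mapping, not the driver's iteration API.

```c
#define CX4_IRQS_PER_CTX 15 /* 4-bit AFU interrupt number, 0 is invalid */

/* Number of contexts needed to carry nirqs interrupts. */
int cx4_contexts_needed(int nirqs)
{
    if (nirqs <= 0)
        return 0;
    return (nirqs + CX4_IRQS_PER_CTX - 1) / CX4_IRQS_PER_CTX;
}

typedef struct {
    int ctx_index; /* 0 = default context, 1.. = extra linked contexts */
    int afu_irq;   /* AFU interrupt number within that context, 1..15 */
} irq_slot;

/* Map a 0-based flat interrupt index to its context and AFU irq number. */
irq_slot cx4_irq_slot(int flat_index)
{
    irq_slot s;

    s.ctx_index = flat_index / CX4_IRQS_PER_CTX;
    s.afu_irq = flat_index % CX4_IRQS_PER_CTX + 1;
    return s;
}
```

For example, 16 interrupts need two contexts, and the sixteenth interrupt (flat index 15) lands on AFU interrupt number 1 of the first extra context.
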
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Allow a default context to be associated with an external pci_dev · a19bd79e
      Ian Munsie committed
      The cxl kernel API has a concept of a default context associated with
      each PCI device under the virtual PHB. The Mellanox CX4 will also use
      the cxl kernel API, but it does not use a virtual PHB - rather, the AFU
      appears as a physical function as a peer to the networking functions.
      
      In order to allow the kernel API to work with those networking
      functions, we will need to associate a default context with them as
      well. To this end, refactor the corresponding code to do this in vphb.c
      and export it so that it can be called from the PHB code.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Move cxl_afu_get / cxl_afu_put to base · 62ccf2d2
      Ian Munsie committed
      The Mellanox CX4 uses a model where the AFU is one physical function of
      the device, and is used by other peer physical functions of the same
      device. This will require those other devices to grab a reference on the
      AFU when they are initialised to make sure that it does not go away
      during their lifetime.
      
      Move the AFU refcount functions to base.c so they can be called from
      the PHB code.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  12. 08 Jul, 2016 2 commits
    • cxl: Refine slice error debug messages · 6e0c50f9
      Philippe Bergheaud committed
      The PSL Slice Error Register (PSL_SERR_An) reports implementation
      dependent AFU errors, in the form of a bitmap. The PSL_SERR_An
      register content is printed as a hex dump in a debug message.
      
      This patch decodes the PSL_SERR_An register contents, and prints a
      specific error message for each possible error bit. It also dumps
      the secondary registers AFU_ERR_An and PSL_DSISR_An, which may
      contain extra debug information.
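
Decoding a bitmap register into one message per set bit usually comes down to a table walk. The sketch below shows that pattern only; the bit positions and message strings are invented for illustration and are not the real PSL_SERR_An layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t mask;
    const char *msg;
} serr_bit;

/* Hypothetical bit assignments, NOT the real register layout. */
static const serr_bit serr_bits[] = {
    { 1ull << 0, "example: AFU MMIO timeout" },
    { 1ull << 1, "example: MMIO to disabled AFU" },
    { 1ull << 2, "example: bad context handle" },
};

/*
 * Write one line per set bit into buf (truncating if buf is too small)
 * and return the number of known error bits found in serr.
 */
size_t serr_decode(uint64_t serr, char *buf, size_t len)
{
    size_t count = 0, off = 0;

    if (len > 0)
        buf[0] = '\0';

    for (size_t i = 0; i < sizeof(serr_bits) / sizeof(serr_bits[0]); i++) {
        if (!(serr & serr_bits[i].mask))
            continue;
        count++;
        if (off < len) {
            int n = snprintf(buf + off, len - off, "%s\n", serr_bits[i].msg);
            if (n > 0)
                off += (size_t)n;
        }
    }
    return count;
}
```

Keeping the bit-to-message mapping in a table means adding a newly documented error bit is a one-line change.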
      
      This patch also removes the large WARN message that used to report
      the cxl slice error interrupt, and replaces it by a short informative
      message, that draws attention to AFU implementation errors.
      Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Fix bug where AFU disable operation had no effect · 5e7823c9
      Ian Munsie committed
      The AFU disable operation has a bug where it will not clear the enable
      bit and therefore will have no effect. To date this has likely been
      masked by the fact that we perform an AFU reset before the disable, which
      also has the effect of clearing the enable bit, making the following
      disable operation effectively a noop on most hardware. This patch
      modifies the afu_control function to take a parameter to clear from the
      AFU control register so that the disable operation can clear the
      appropriate bit.
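
The effect of the extra "clear" parameter can be shown with a pure function that computes the next control register value. The bit positions and the name `afu_cntl_next` are assumptions for the sketch; only the read-modify-write shape matters.

```c
#include <stdint.h>

/* Hypothetical bit layout for the sketch. */
#define AFU_CNTL_ENABLE (1ull << 0)
#define AFU_CNTL_RESET  (1ull << 1)

/* Next register value: drop the bits in clear, then set the command bits. */
uint64_t afu_cntl_next(uint64_t current, uint64_t command, uint64_t clear)
{
    return (current & ~clear) | command;
}
```

A disable is then `afu_cntl_next(reg, 0, AFU_CNTL_ENABLE)`, which clears the enable bit. Passing 0 as the clear mask reproduces the behaviour described as the bug: the enable bit survives and the disable has no effect.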
      
      This bug was uncovered on the Mellanox CX4, which uses an XSL rather
      than a PSL. On the XSL the reset operation will not complete while the
      AFU is enabled, meaning the enable bit was still set at the start of the
      disable and as a result this bug was hit and the disable also timed out.
      
      Because of this difference in behaviour between the PSL and XSL, this
      patch now makes the reset dependent on the card using a PSL to avoid
      waiting for a timeout on the XSL. It is entirely possible that we may be
      able to drop the reset altogether if it turns out we only ever needed it
      due to this bug - however I am not willing to drop it without further
      regression testing and have added comments to the code explaining the
      background.
      
      This also fixes a small issue where the AFU_Cntl register was read
      outside of the lock that protects it.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  13. 28 Jun, 2016 2 commits
    • cxl: Add set and get private data to context struct · ad42de85
      Michael Neuling committed
      This provides AFU drivers a means to associate private data with a cxl
      context. This is particularly intended to make the new callbacks for
      driver specific events easier for AFU drivers to use, as they can easily
      get back to any private data structures they may use.
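
The pattern is the usual opaque-pointer accessor pair. A minimal sketch, with a hypothetical `ctx_model` standing in for the cxl context and hypothetical function names:

```c
#include <stddef.h>

/* Hypothetical stand-in for the context; only the private slot is shown. */
struct ctx_model {
    void *priv;
};

/* Attach driver-private data to a context. Returns 0, or -1 on a bad ctx. */
int ctx_set_priv(struct ctx_model *ctx, void *priv)
{
    if (!ctx)
        return -1;
    ctx->priv = priv;
    return 0;
}

/* Retrieve whatever the driver attached earlier, or NULL. */
void *ctx_get_priv(struct ctx_model *ctx)
{
    return ctx ? ctx->priv : NULL;
}
```

A driver callback that only receives the context can then recover its own state with `ctx_get_priv(ctx)` instead of maintaining a separate lookup table.
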
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
      Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Add mechanism for delivering AFU driver specific events · b810253b
      Philippe Bergheaud committed
      This adds an afu_driver_ops structure with fetch_event() and
      event_delivered() callbacks. An AFU driver such as cxlflash can fill
      this out and associate it with a context to enable passing custom AFU
      specific events to userspace.
      
      This also adds a new kernel API function cxl_context_pending_events(),
      that the AFU driver can use to notify the cxl driver that new specific
      events are ready to be delivered, and wake up anyone waiting on the
      context wait queue.
      
      The current count of AFU driver specific events is stored in the field
      afu_driver_events of the context structure.
      
      The cxl driver checks the afu_driver_events count during poll, select,
      read, etc. calls to check if an AFU driver specific event is pending,
      and calls fetch_event() to obtain and deliver that event. This way, the
      cxl driver takes care of all the usual locking semantics around these
      calls and handles all the generic cxl events, so that the AFU driver
      only needs to worry about its own events.
      
      fetch_event() returns a struct cxl_event_afu_driver_reserved, allocated
      by the AFU driver, and filled in with the specific event information and
      size. Total event size (header + data) should not be greater than
      CXL_READ_MIN_SIZE (4K).
      
      The cxl driver prepends an appropriate cxl event header, copies the event
      to userspace, and finally calls event_delivered() to return the status of
      the operation to the AFU driver. The event is identified by the context
      and cxl_event_afu_driver_reserved pointers.
      
      Since AFU drivers provide their own means for userspace to obtain the
      AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file
      descriptor to obtain the AFU file descriptor) and the generic cxl driver
      will never use this event, the ABI of the event is up to each individual
      AFU driver.
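
The fetch/deliver handshake can be modelled in user space as follows. Everything here is a simplified stand-in: `drv_ops`, `drv_event`, `ctx_model`, and the `demo_*` driver are hypothetical, the event header is omitted, and no size check against CXL_READ_MIN_SIZE is made.

```c
#include <stddef.h>
#include <string.h>

struct ctx_model;

/* Simplified stand-in for the driver-allocated event. */
struct drv_event {
    size_t data_size;
    unsigned char data[64];
};

/* Simplified stand-in for the callbacks structure. */
struct drv_ops {
    struct drv_event *(*fetch_event)(struct ctx_model *ctx);
    void (*event_delivered)(struct ctx_model *ctx, struct drv_event *ev, int rc);
};

struct ctx_model {
    const struct drv_ops *ops;
    int pending; /* models the afu_driver_events count */
    void *priv;
};

/*
 * Models the read path: fetch one pending event, copy it out, then tell
 * the driver how delivery went. Returns bytes copied, 0 if nothing is
 * pending, or -1 on failure.
 */
long ctx_read_event(struct ctx_model *ctx, unsigned char *out, size_t out_len)
{
    struct drv_event *ev;
    size_t size;
    int rc = 0;

    if (!ctx->ops || ctx->pending <= 0)
        return 0;
    ev = ctx->ops->fetch_event(ctx);
    if (!ev)
        return 0;
    ctx->pending--;

    size = ev->data_size; /* read before the driver may release ev */
    if (size > sizeof(ev->data) || size > out_len)
        rc = -1;
    else
        memcpy(out, ev->data, size);

    ctx->ops->event_delivered(ctx, ev, rc);
    return rc ? -1 : (long)size;
}

/* Hypothetical AFU driver side: one static event, delivery is recorded. */
struct demo_drv {
    struct drv_event ev;
    int last_rc;
    int delivered;
};

struct drv_event *demo_fetch(struct ctx_model *ctx)
{
    struct demo_drv *d = ctx->priv;

    return &d->ev;
}

void demo_delivered(struct ctx_model *ctx, struct drv_event *ev, int rc)
{
    struct demo_drv *d = ctx->priv;

    (void)ev;
    d->last_rc = rc;
    d->delivered++;
}

const struct drv_ops demo_ops = { demo_fetch, demo_delivered };
```

The generic side owns the pending count and the copy, while the driver only produces events and learns the outcome through `event_delivered()`, which mirrors the division of labour described above.
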
      Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  14. 16 Jun, 2016 3 commits
    • cxl: Add support for CAPP DMA mode · b385c9e9
      Ian Munsie committed
      This adds support for using CAPP DMA mode, which is required for XSL
      based cards such as the Mellanox CX4 to function.
      
      This is currently an RFC as it depends on the corresponding support to
      be merged into skiboot first, which was submitted here:
      http://patchwork.ozlabs.org/patch/625582/
      
      In the event that the skiboot on the system does not have the above
      support, it will indicate as such in the kernel log and abort the init
      process.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Abstract the differences between the PSL and XSL · 6d382616
      Frederic Barrat committed
      The XSL (Translation Service Layer) is a stripped down version of the
      PSL (Power Service Layer) used in some cards such as the Mellanox CX4.
      
      Like the PSL, it implements the CAIA architecture, but has a number of
      differences, mostly in its implementation dependent registers. This
      adds an ops structure to abstract these differences to bring initial
      support for XSL CAPI devices.
      
      The XSL does not implement the optional architected SERR register,
      however while it treats it as a reserved register and should work with
      no special treatment, attempting to access it will cause the XSL_FEC
      (First Error Capture) register to be filled out, preventing it from
      capturing any subsequent errors. Therefore, this patch also prevents the
      kernel from trying to set up the SERR register so that the FEC register
      may still be useful, and to save one interrupt.
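
An ops structure of this kind is typically a table of function pointers, where a NULL entry means "this service layer does not do that step". The sketch below models only the SERR case described above; `sl_ops`, `afu_model`, and the function names are hypothetical and far smaller than a real driver structure.

```c
#include <stddef.h>

struct afu_model;

/* Hypothetical per-service-layer operations table. */
struct sl_ops {
    const char *name;
    /* NULL when the service layer has no usable SERR register. */
    int (*register_serr_irq)(struct afu_model *afu);
};

struct afu_model {
    const struct sl_ops *ops;
    int serr_irq; /* 0 = no SERR interrupt set up */
};

int psl_register_serr_irq(struct afu_model *afu)
{
    afu->serr_irq = 1; /* placeholder for real interrupt allocation */
    return 0;
}

const struct sl_ops psl_ops = { "PSL", psl_register_serr_irq };
const struct sl_ops xsl_ops = { "XSL", NULL };

/* Generic init path: only touch SERR when the service layer supports it. */
int afu_init(struct afu_model *afu)
{
    afu->serr_irq = 0;
    if (afu->ops->register_serr_irq)
        return afu->ops->register_serr_irq(afu);
    return 0;
}
```

The generic code never tests which card it is running on; it only asks whether the operation exists, so the XSL path skips the SERR register and saves the interrupt.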
      
      The XSL also uses a special DMA cxl mode, which uses a slightly
      different init sequence for the CAPP and PHB. The kernel support for
      this will be in a future patch once the corresponding support has been
      merged into skiboot.
      Co-authored-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Update process element after allocating interrupts · 292841b0
      Ian Munsie committed
      In the kernel API, it is possible to attempt to allocate AFU interrupts
      after already starting a context. Since the process element structure
      used by the hardware is only filled out at the time the context is
      started, it will not be updated with the interrupt numbers that have
      just been allocated and therefore AFU interrupts will not work unless
      they were allocated prior to starting the context.
      
      This can present some difficulties as each CAPI enabled PCI device in
      the kernel API has a default context, which may need to be started very
      early to enable translations, potentially before interrupts can easily
      be set up.
      
      This patch makes the API more flexible to allow interrupts to be
      allocated after a context has already been started and takes care of
      updating the PE structure used by the hardware and notifying it to
      discard any cached copy it may have.
      
      The update is currently performed via a terminate/remove/add sequence.
      This is necessary on some hardware such as the XSL that does not
      properly support the update LLCMD.
      
      Note that this is only supported on powernv at present - attempting to
      perform this ordering on PowerVM will raise a warning.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  15. 11 May, 2016 3 commits
    • cxl: Check periodically the coherent platform function's state · 266eab8f
      Christophe Lombard committed
      In the PowerVM environment, the PHYP CoherentAccel component manages
      the state of the Coherent Accelerator Processor Interface adapter and
      virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
      interrupts - and provides a new set of hcalls for the OS APIs to utilize
      Accelerator Function Unit (AFU).
      
      During the course of operation, a coherent platform function can
      encounter errors. Some possible reasons for errors are:
      • Hardware recoverable and unrecoverable errors
      • Transient and over-threshold correctable errors
      
      PHYP implements its own state model for the coherent platform function.
      The state of the AFU is available through a hcall.
      
      The current implementation of the cxl driver, for the PowerVM
      environment, checks this state of the AFU only when an action is
      requested - open a device, ioctl command, memory map, attach/detach a
      process - from an external driver - cxlflash, libcxl. If an error is
      detected the cxl driver handles the error according to the content of the
      Power Architecture Platform Requirements document.
      
      But in case of low-level troubles (or error injection), the PHYP
      component may reset the card and change the AFU state. The PHYP
      interface doesn't provide any way to be notified when that happens, which
      means that the cxl driver:
      • cannot immediately handle the state change of the AFU.
      • cannot notify other drivers (cxlflash, ...)
      
      The purpose of this patch is to wake up the cpu periodically to check
      the current state of each AFU and to see if we need to enter an error
      recovery path.
      Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Add kernel API to allow a context to operate with relocate disabled · 7a0d85d3
      Ian Munsie committed
      cxl devices typically access memory using an MMU in much the same way as
      the CPU, and each context includes a state register much like the MSR in
      the CPU. Like the CPU, the state register includes a bit to enable
      relocation, which we currently always enable.
      
      In some cases, it may be desirable to allow a device to access memory
      using real addresses instead of effective addresses, so this adds a new
      API, cxl_set_translation_mode, that can be used to disable relocation
      on a given kernel context. This can allow for the creation of a special
      privileged context that the device can use if it needs relocation
      disabled, and can use regular contexts at times when it needs relocation
      enabled.
      
      This interface is only available to users of the kernel API for obvious
      reasons, and will never be supported in a virtualised environment.
      
      This will be used by the upcoming cxl support in the mlx5 driver.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • cxl: Remove duplicate #defines · 0e5b5ba1
      Ian Munsie committed
      These defines are not used, but other equivalent definitions
      (CXL_SPA_SW_CMD_*) are used. Remove the unused defines.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  16. 27 Apr, 2016 1 commit
  17. 22 Apr, 2016 1 commit
  18. 09 Mar, 2016 8 commits