1. 14 7月, 2016 2 次提交
  2. 08 7月, 2016 3 次提交
    • P
      cxl: Ignore CAPI adapters misplaced in switched slots · 3b3dcd61
      Philippe Bergheaud 提交于
      One should not attempt to switch a PHB into CAPI mode if there is
      a switch between the PHB and the adapter. This patch modifies the
      cxl driver to ignore CAPI adapters misplaced in switched slots.
      Signed-off-by: NPhilippe Bergheaud <felix@linux.vnet.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      3b3dcd61
    • I
      cxl: Fix bug where AFU disable operation had no effect · 5e7823c9
      Ian Munsie 提交于
      The AFU disable operation has a bug where it will not clear the enable
      bit and therefore will have no effect. To date this has likely been
      masked by fact that we perform an AFU reset before the disable, which
      also has the effect of clearing the enable bit, making the following
      disable operation effectively a noop on most hardware. This patch
      modifies the afu_control function to take a parameter to clear from the
      AFU control register so that the disable operation can clear the
      appropriate bit.
      
      This bug was uncovered on the Mellanox CX4, which uses an XSL rather
      than a PSL. On the XSL the reset operation will not complete while the
      AFU is enabled, meaning the enable bit was still set at the start of the
      disable and as a result this bug was hit and the disable also timed out.
      
      Because of this difference in behaviour between the PSL and XSL, this
      patch now makes the reset dependent on the card using a PSL to avoid
      waiting for a timeout on the XSL. It is entirely possible that we may be
      able to drop the reset altogether if it turns out we only ever needed it
      due to this bug - however I am not willing to drop it without further
      regression testing and have added comments to the code explaining the
      background.
      
      This also fixes a small issue where the AFU_Cntl register was read
      outside of the lock that protects it.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      5e7823c9
    • I
      cxl: Fix allowing bogus AFU descriptors with 0 maximum processes · 49e9c99f
      Ian Munsie 提交于
      If the AFU descriptor of an AFU directed AFU indicates that it supports
      0 maximum processes, we will accept that value and attempt to use it.
      The SPA will still be allocated (with 2 pages due to another minor bug
      and room for 958 processes), and when a context is allocated we will
      pass the value of 0 to idr_alloc as the maximum. However, idr_alloc will
      treat that as meaning no maximum and will allocate a context number and
      we return a valid context.
      
      Conceivably, this could lead to a buffer overflow of the SPA if more
      than 958 contexts were allocated, however this is mitigated by the fact
      that there are no known AFUs in the wild with a bogus AFU descriptor
      like this, and that only the root user is allowed to flash an AFU image
      to a card.
      
      Add a check when validating the AFU descriptor to reject any with 0
      maximum processes.
      
      We do still allow a dedicated process only AFU to indicate that it
      supports 0 contexts even though that is forbidden in the architecture,
      as in that case we ignore the value and use 1 instead. This is just on
      the off-chance that such a dedicated process AFU may exist (not that I
      am aware of any), since their developers are less likely to have cared
      about this value at all.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      49e9c99f
  3. 16 6月, 2016 2 次提交
    • I
      cxl: Add support for CAPP DMA mode · b385c9e9
      Ian Munsie 提交于
      This adds support for using CAPP DMA mode, which is required for XSL
      based cards such as the Mellanox CX4 to function.
      
      This is currently an RFC as it depends on the corresponding support to
      be merged into skiboot first, which was submitted here:
      http://patchwork.ozlabs.org/patch/625582/
      
      In the event that the skiboot on the system does not have the above
      support, it will indicate as such in the kernel log and abort the init
      process.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b385c9e9
    • F
      cxl: Abstract the differences between the PSL and XSL · 6d382616
      Frederic Barrat 提交于
      The XSL (Translation Service Layer) is a stripped down version of the
      PSL (Power Service Layer) used in some cards such as the Mellanox CX4.
      
      Like the PSL, it implements the CAIA architecture, but has a number of
      differences, mostly in it's implementation dependent registers. This
      adds an ops structure to abstract these differences to bring initial
      support for XSL CAPI devices.
      
      The XSL does not implement the optional architected SERR register,
      however while it treats it as a reserved register and should work with
      no special treatment, attempting to access it will cause the XSL_FEC
      (First Error Capture) register to be filled out, preventing it from
      capturing any subsequent errors. Therefore, this patch also prevents the
      kernel from trying to set up the SERR register so that the FEC register
      may still be useful, and to save one interrupt.
      
      The XSL also uses a special DMA cxl mode, which uses a slightly
      different init sequence for the CAPP and PHB. The kernel support for
      this will be in a future patch once the corresponding support has been
      merged into skiboot.
      Co-authored-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6d382616
  4. 22 4月, 2016 2 次提交
  5. 11 4月, 2016 1 次提交
  6. 09 3月, 2016 7 次提交
  7. 29 2月, 2016 1 次提交
  8. 06 2月, 2016 1 次提交
    • B
      PCI: Remove includes of asm/pci-bridge.h · 952bbcb0
      Bjorn Helgaas 提交于
      Drivers should include asm/pci-bridge.h only when they need the arch-
      specific things provided there.  Outside of the arch/ directories, the only
      drivers that actually need things provided by asm/pci-bridge.h are the
      powerpc RPA hotplug drivers in drivers/pci/hotplug/rpa*.
      
      Remove the includes of asm/pci-bridge.h from the other drivers, adding an
      include of linux/pci.h if necessary.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      952bbcb0
  9. 11 1月, 2016 1 次提交
  10. 06 10月, 2015 1 次提交
  11. 15 9月, 2015 1 次提交
    • D
      cxl: Fix unbalanced pci_dev_get in cxl_probe · 2925c2fd
      Daniel Axtens 提交于
      Currently the first thing we do in cxl_probe is to grab a reference
      on the pci device. Later on, we call device_register on our adapter.
      In our remove path, we call device_unregister, but we never call
      pci_dev_put. We therefore leak the device every time we do a
      reflash.
      
      device_register/unregister is sufficient to hold the reference.
      Therefore, drop the call to pci_dev_get.
      
      Here's why this is safe.
      The proposed cxl_probe(pdev) calls cxl_adapter_init:
          a) init calls cxl_adapter_alloc, which creates a struct cxl,
             conventionally called adapter. This struct contains a
             device entry, adapter->dev.
      
          b) init calls cxl_configure_adapter, where we set
             adapter->dev.parent = &dev->dev (here dev is the pci dev)
      
      So at this point, the cxl adapter's device's parent is the PCI
      device that I want to be refcounted properly.
      
          c) init calls cxl_register_adapter
             *) cxl_register_adapter calls device_register(&adapter->dev)
      
      So now we're in device_register, where dev is the adapter device, and
      we want to know if the PCI device is safe after we return.
      
      device_register(&adapter->dev) calls device_initialize() and then
      device_add().
      
      device_add() does a get_device(). device_add() also explicitly grabs
      the device's parent, and calls get_device() on it:
      
               parent = get_device(dev->parent);
      
      So therefore, device_register() takes a lock on the parent PCI dev,
      which is what pci_dev_get() was guarding. pci_dev_get() can therefore
      be safely removed.
      
      Fixes: f204e0b8 ("cxl: Driver code for powernv PCIe based cards for userspace access")
      Cc: stable@vger.kernel.org
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      2925c2fd
  12. 30 8月, 2015 1 次提交
  13. 27 8月, 2015 1 次提交
    • D
      cxl: Remove racy attempt to force EEH invocation in reset · 9d8e2767
      Daniel Axtens 提交于
      cxl_reset currently PERSTs the slot, and then repeatedly tries to
      read MMIO space in order to kick off EEH.
      
      There are 2 problems with this: it's unnecessary, and it's racy.
      
      It's unnecessary because the PERST will bring down the PHB link.
      That will be picked up by the CAPP, which will send out an HMI.
      Skiboot, noticing an HMI from the CAPP, will send an OPAL
      notification to the kernel, which will trigger EEH recovery.
      
      It's also racy: the EEH recovery triggered by the CAPP will
      eventually cause the MMIO space to have its mapping invalidated
      and the pointer NULLed out. This races with our attempt to read
      the MMIO space. This is causing OOPSes in testing.
      
      Simply drop all the attempts to force EEH detection, and trust
      that Skiboot will send the notification and that we'll act on it.
      The Skiboot code to send the EEH notification has been in Skiboot
      for as long as CAPP recovery has been supported, so we don't need
      to worry about breaking obscure setups with ancient firmware.
      
      Cc: Ryan Grimm <grimm@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9d8e2767
  14. 22 8月, 2015 1 次提交
  15. 14 8月, 2015 7 次提交
    • D
      cxl: EEH support · 9e8df8a2
      Daniel Axtens 提交于
      EEH (Enhanced Error Handling) allows a driver to recover from the
      temporary failure of an attached PCI card. Enable basic CXL support
      for EEH.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9e8df8a2
    • D
      cxl: Allow the kernel to trust that an image won't change on PERST. · 13e68d8b
      Daniel Axtens 提交于
      Provide a kernel API and a sysfs entry which allow a user to specify
      that when a card is PERSTed, it's image will stay the same, allowing
      it to participate in EEH.
      
      cxl_reset is used to reflash the card. In that case, we cannot safely
      assert that the image will not change. Therefore, disallow cxl_reset
      if the flag is set.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      13e68d8b
    • D
      cxl: Don't remove AFUs/vPHBs in cxl_reset · 4e1efb40
      Daniel Axtens 提交于
      If the driver doesn't participate in EEH, the AFUs will be removed
      by cxl_remove, which will be invoked by EEH.
      
      If the driver does particpate in EEH, the vPHB needs to stick around
      so that the it can particpate.
      
      In both cases, we shouldn't remove the AFU/vPHB.
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4e1efb40
    • D
      cxl: Refactor AFU init/teardown · d76427b0
      Daniel Axtens 提交于
      As with an adapter, some aspects of initialisation are done only once
      in the lifetime of an AFU: for example, allocating memory, or setting
      up sysfs/debugfs files.
      
      However, we may want to be able to do some parts of the initialisation
      multiple times: for example, in error recovery we want to be able to
      tear down and then re-map IO memory and IRQs.
      
      Therefore, refactor AFU init/teardown as follows.
      
       - Create two new functions: 'cxl_configure_afu', and its pair
         'cxl_deconfigure_afu'. As with the adapter functions,
         these (de)configure resources that do not need to last the entire
         lifetime of the AFU.
      
       - Allocating and releasing memory remain the task of 'cxl_alloc_afu'
         and 'cxl_release_afu'.
      
       - Once-only functions that do not involve allocating/releasing memory
         stay in the overarching 'cxl_init_afu'/'cxl_remove_afu' pair.
         However, the task of picking an AFU mode and activating it has been
         broken out.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d76427b0
    • D
      cxl: Refactor adaptor init/teardown · c044c415
      Daniel Axtens 提交于
      Some aspects of initialisation are done only once in the lifetime of
      an adapter: for example, allocating memory for the adapter,
      allocating the adapter number, or setting up sysfs/debugfs files.
      
      However, we may want to be able to do some parts of the
      initialisation multiple times: for example, in error recovery we
      want to be able to tear down and then re-map IO memory and IRQs.
      
      Therefore, refactor CXL init/teardown as follows.
      
       - Keep the overarching functions 'cxl_init_adapter' and its pair,
         'cxl_remove_adapter'.
      
       - Move all 'once only' allocation/freeing steps to the existing
         'cxl_alloc_adapter' function, and its pair 'cxl_release_adapter'
         (This involves moving allocation of the adapter number out of
         cxl_init_adapter.)
      
       - Create two new functions: 'cxl_configure_adapter', and its pair
         'cxl_deconfigure_adapter'. These two functions 'wire up' the
         hardware --- they (de)configure resources that do not need to
         last the entire lifetime of the adapter
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c044c415
    • D
      cxl: Clean up adapter MMIO unmap path. · 575e6986
      Daniel Axtens 提交于
      - MMIO pointer unmapping is guarded by a null pointer check.
         However, iounmap doesn't null the pointer, just invalidate it.
         Therefore, explicitly null the pointer after unmapping.
      
       - afu_desc_mmio also needs to be unmapped.
      
       - PCI regions are allocated in cxl_map_adapter_regs.
         Therefore they should be released in unmap, not elsewhere.
      Acked-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      575e6986
    • D
      cxl: Allocate and release the SPA with the AFU · 05155772
      Daniel Axtens 提交于
      Previously the SPA was allocated and freed upon entering and leaving
      AFU-directed mode. This causes some issues for error recovery - contexts
      hold a pointer inside the SPA, and they may persist after the AFU has
      been detached.
      
      We would ideally like to allocate the SPA when the AFU is allocated, and
      release it until the AFU is released. However, we don't know how big the
      SPA needs to be until we read the AFU descriptor.
      
      Therefore, restructure the code:
      
       - Allocate the SPA only once, on the first attach.
      
       - Release the SPA only when the entire AFU is being released (not
         detached). Guard the release with a NULL check, so we don't free
         if it was never allocated (e.g. dedicated mode)
      Acked-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      05155772
  16. 16 7月, 2015 1 次提交
  17. 13 7月, 2015 1 次提交
  18. 06 7月, 2015 1 次提交
  19. 19 6月, 2015 1 次提交
  20. 03 6月, 2015 4 次提交