1. 06 December 2022, 3 commits
    • PCI/MSI: Provide pci_ims_alloc/free_irq() · c9e5bea2
      Thomas Gleixner committed
      Provide single-vector allocation, which allocates the next free index in
      the IMS space, and a matching free function that releases a vector again
      (a usage sketch follows this entry).
      
      All allocated vectors are also released via pci_free_irq_vectors(), which
      releases MSI/MSI-X vectors as well.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20221124232326.961711347@linutronix.de
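      
      A minimal usage sketch of this interface, assuming the behaviour
      described above; the handler, name and cookie value are illustrative
      placeholders, not part of the commit:
      
        #include <linux/interrupt.h>
        #include <linux/msi.h>
        #include <linux/pci.h>
        
        /* Illustrative only: allocate one IMS vector, request it, free it on error. */
        static int example_setup_ims_vector(struct pci_dev *pdev,
                                            irq_handler_t handler, void *ctx)
        {
                union msi_instance_cookie icookie = { .value = 0 };
                struct msi_map map;
                int ret;
        
                /* The core picks the next free index in the IMS space. */
                map = pci_ims_alloc_irq(pdev, &icookie, NULL);
                if (map.index < 0)
                        return map.index;
        
                ret = request_irq(map.virq, handler, 0, "example-ims", ctx);
                if (ret) {
                        /* Release the vector again; pci_free_irq_vectors() would
                         * also release it together with any MSI/MSI-X vectors. */
                        pci_ims_free_irq(pdev, map);
                        return ret;
                }
                return 0;
        }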
    • PCI/MSI: Provide IMS (Interrupt Message Store) support · 0194425a
      Thomas Gleixner committed
      IMS (Interrupt Message Store) is a new specification which allows
      implementation-specific storage of MSI messages, in contrast to the
      strictly standardized MSI and MSI-X message stores.
      
      This requires new device specific interrupt domains to handle the
      implementation defined storage which can be an array in device memory or
      host/guest memory which is shared with hardware queues.
      
      Add a function to create IMS domains for PCI devices. IMS domains use the
      new per-device domain mechanism and are configured by the device driver
      via a template (see the sketch after this entry). IMS domains are created
      as secondary device domains, so they work side by side with MSI[-X] on
      the same device.
      
      The IMS domains have a few constraints:
      
        - The index space is managed by the core code.
      
          Device memory based IMS provides a storage array with a fixed size
          which obviously requires an index. But there is no association between
          index and functionality so the core can randomly allocate an index in
          the array.
      
          System memory based IMS does not have the concept of an index as the
          storage is somewhere in memory. In that case the index is purely
          software based to keep track of the allocations.
      
        - There is no requirement for consecutive index ranges
      
          This is currently a limitation of the MSI core and can be implemented
          if there is a justified use case by changing the internal storage from
          xarray to maple_tree. For now it's single vector allocation.
      
        - The interrupt chip must provide the following callbacks:

            - irq_mask()
            - irq_unmask()
            - irq_write_msi_msg()

        - The interrupt chip must additionally provide the following callbacks
          when irq_mask(), irq_unmask() and irq_write_msi_msg() cannot operate
          directly on hardware, e.g. when the interrupt message store is in
          queue memory:

            - irq_bus_lock()
            - irq_bus_unlock()

          These callbacks are invoked from preemptible task context and are
          allowed to sleep. In that case the mandatory callbacks above just
          store the information, and the irq_bus_unlock() callback is supposed
          to make the change effective before returning.

        - Interrupt affinity setting is handled by the underlying parent
          interrupt domain and communicated to the IMS domain via
          irq_write_msi_msg(). IMS domains cannot have an irq_set_affinity()
          callback. That's a reasonable restriction, similar to the PCI/MSI
          device domain implementations.
      
      The domain is automatically destroyed when the PCI device is removed.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de
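      
      A hedged sketch of a driver-side template and domain creation under the
      constraints above; the callbacks, names and the array size of 64 are
      assumptions for illustration, not taken from the commit:
      
        #include <linux/irq.h>
        #include <linux/msi.h>
        #include <linux/pci.h>
        
        /* Placeholder callbacks: how mask/unmask and the message write reach the
         * implementation-defined IMS storage is entirely device specific. */
        static void foo_ims_mask(struct irq_data *data) { /* device specific */ }
        static void foo_ims_unmask(struct irq_data *data) { /* device specific */ }
        static void foo_ims_write_msg(struct irq_data *data, struct msi_msg *msg)
        { /* device specific */ }
        
        static const struct msi_domain_template foo_ims_template = {
                .chip = {
                        .name                   = "foo-ims",
                        .irq_mask               = foo_ims_mask,
                        .irq_unmask             = foo_ims_unmask,
                        .irq_write_msi_msg      = foo_ims_write_msg,
                },
        };
        
        static int foo_init_ims(struct pci_dev *pdev)
        {
                /* Secondary device domain, coexisting with MSI[-X] on the same
                 * device; 64 is an assumed size of the device's IMS storage array. */
                if (!pci_create_ims_domain(pdev, &foo_ims_template, 64, NULL))
                        return -ENODEV;
                return 0;
        }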
    • PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X · 34026364
      Thomas Gleixner committed
      MSI-X vectors can be allocated after the initial MSI-X enablement, but this
      needs explicit support of the underlying interrupt domains.
      
      Provide a function to query this capability and functions to allocate and
      free individual vectors after MSI-X has been enabled.
      
      The allocation can either request a specific index in the MSI-X table or,
      with the index argument MSI_ANY_INDEX, allocate the next free vector.
      
      The return value is a struct msi_map which on success contains both the
      index and the Linux interrupt number. In case of failure the index is
      negative and the Linux interrupt number is 0.
      
      The allocation function handles a single MSI-X index at a time, which is
      sufficient for the most urgent use case: letting VFIO get rid of the
      'disable MSI-X, reallocate, enable MSI-X' cycle, which is prone to lost
      interrupts and to redirection to the legacy and obviously unhandled INTx
      (a usage sketch follows this entry).
      
      Single index allocation is also sufficient for the use cases Jason
      Gunthorpe pointed out: allocation of an MSI-X or IMS vector for a network
      queue. See the Link below.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de
      Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de
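      
      A minimal usage sketch of the post-enable interfaces described above; the
      function name is an illustrative placeholder:
      
        #include <linux/msi.h>
        #include <linux/pci.h>
        
        /* Illustrative only: grab one more MSI-X vector while MSI-X is already
         * enabled, letting the core pick a free table index. */
        static int example_add_msix_vector(struct pci_dev *pdev)
        {
                struct msi_map map;
        
                if (!pci_msix_can_alloc_dyn(pdev))
                        return -EOPNOTSUPP;     /* parent domain lacks dynamic support */
        
                map = pci_msix_alloc_irq_at(pdev, MSI_ANY_INDEX, NULL);
                if (map.index < 0)
                        return map.index;       /* index is negative on failure */
        
                /* map.virq is the Linux interrupt number; request_irq() etc. go
                 * here.  Release later with pci_msix_free_irq(pdev, map). */
                return map.virq;
        }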
  2. 04 December 2022, 1 commit
  3. 17 November 2022, 1 commit
  4. 15 November 2022, 1 commit
  5. 09 November 2022, 1 commit
  6. 27 September 2022, 1 commit
  7. 13 September 2022, 2 commits
  8. 30 July 2022, 1 commit
    • PCI: Remove pci_mmap_page_range() wrapper · 0ad722f1
      Arnd Bergmann committed
      The ARCH_GENERIC_PCI_MMAP_RESOURCE symbol came up in a recent discussion,
      and I noticed that this was left behind by an unfinished cleanup from 2017.
      
      The only architecture that still relies on providing its own
      pci_mmap_page_range() helper instead of using the generic
      pci_mmap_resource_range() is sparc. Presumably the reasons for this have
      not changed, but at least this can be simplified by converting sparc to use
      the same interface as the others.
      
      The only difference between the two is the device-specific offset that gets
      added to or subtracted from vma->vm_pgoff.
      
      Change the only caller of pci_mmap_page_range() in common code to subtract
      this offset and call the modern interface, while adding it back in the
      sparc implementation to preserve the existing behavior.
      
      This removes the complexities of the dual interfaces from the common code,
      and keeps it all specific to the sparc architecture code. According to
      David Miller, the sparc code lets user space poke into the VGA I/O port
      registers by mmapping the I/O space of the parent bridge device, which is
      something that the generic pci_mmap_resource_range() code apparently does
      not.
      
      Link: https://lore.kernel.org/lkml/1519887203.622.3.camel@infradead.org/t/
      Link: https://lore.kernel.org/lkml/20220714214657.2402250-3-shorne@gmail.com/
      Link: https://lore.kernel.org/r/20220715153617.3393420-1-arnd@kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Stafford Horne <shorne@gmail.com>
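      
      A hedged sketch of the conversion pattern described above (not the actual
      proc.c or sparc code; the function name is a placeholder): the common-code
      caller rebases vma->vm_pgoff from an absolute offset to a
      resource-relative one before using the generic interface.
      
        #include <linux/mm.h>
        #include <linux/pci.h>
        
        static int example_mmap_bar(struct pci_dev *pdev, int bar,
                                    struct vm_area_struct *vma,
                                    enum pci_mmap_state state, int write_combine)
        {
                /* Device-specific offset of the BAR, in pages. */
                resource_size_t start = pci_resource_start(pdev, bar);
        
                vma->vm_pgoff -= start >> PAGE_SHIFT;
                return pci_mmap_resource_range(pdev, bar, vma, state, write_combine);
        }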
  9. 06 May 2022, 1 commit
  10. 28 April 2022, 1 commit
    • bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management · 512881ea
      Lu Baolu committed
      The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
      with the device DMA managed by kernel drivers or user-space applications.
      Unfortunately, multiple devices may be placed in the same IOMMU group
      because they cannot be isolated from each other. The DMA on these devices
      must either be entirely under kernel control or userspace control, never
      a mixture. Otherwise driver integrity is not guaranteed, because the
      devices could access each other through peer-to-peer accesses which
      bypass the IOMMU protection.
      
      This patch checks and sets the default DMA mode during driver binding,
      and cleans up during driver unbinding. In the default mode, the device
      DMA is
      managed by the device driver which handles DMA operations through the
      kernel DMA APIs (see Documentation/core-api/dma-api.rst).
      
      For cases where the devices are assigned for userspace control through a
      userspace driver framework (i.e. VFIO), the drivers (for example vfio_pci,
      vfio_platform, etc.) may set a new flag (driver_managed_dma) to skip this
      default setting, on the assumption that such drivers know what they are
      doing with the device DMA (a sketch follows this entry).
      
      Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
      is currently a problem. As things stand, the IOMMU driver ignores the
      initial iommu_probe_device() call when the device is added, since at
      that point it has no fwspec yet. In this situation,
      {of,acpi}_iommu_configure() retrigger iommu_probe_device() after the
      IOMMU driver has seen the firmware data via .of_xlate and learned that
      it is actually responsible for the given device. As a result, until that
      gets fixed, iommu_device_use_default_domain() is called at the end, and
      arch_teardown_dma_ops() is called if it fails.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Stuart Yoder <stuyoder@gmail.com>
      Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Link: https://lore.kernel.org/r/20220418005000.897664-5-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
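      
      A minimal sketch of a driver opting out of the kernel-managed default,
      assuming the driver_managed_dma flag described above; the driver name and
      device IDs are placeholders:
      
        #include <linux/module.h>
        #include <linux/pci.h>
        
        static const struct pci_device_id example_ids[] = {
                { PCI_DEVICE(0x1234, 0x5678) }, /* placeholder vendor/device */
                { }
        };
        MODULE_DEVICE_TABLE(pci, example_ids);
        
        static int example_probe(struct pci_dev *pdev, const struct pci_device_id *id)
        {
                /* The driver takes full responsibility for the device's DMA setup. */
                return 0;
        }
        
        static struct pci_driver example_driver = {
                .name                   = "example-user-dma",
                .id_table               = example_ids,
                .probe                  = example_probe,
                .driver_managed_dma     = true, /* skip the kernel DMA default */
        };
        module_pci_driver(example_driver);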
  11. 22 April 2022, 1 commit
  12. 30 March 2022, 1 commit
  13. 05 March 2022, 1 commit
  14. 27 February 2022, 2 commits
    • PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF · a7e9f240
      Jason Gunthorpe committed
      There are some cases where an SR-IOV VF driver will need to reach into and
      interact with the PF driver. This requires accessing the drvdata of the PF.
      
      Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
      safe way. Normally accessing a drvdata of a foreign struct device would be
      done using the device_lock() to protect against device driver
      probe()/remove() races.
      
      However, due to the design of pci_enable_sriov(), this will result in an
      ABBA deadlock on the device_lock, as the PF's device_lock is held during PF
      sriov_configure() while calling pci_enable_sriov() which in turn holds the
      VF's device_lock while calling VF probe(), and similarly for remove.
      
      This means the VF driver can never obtain the PF's device_lock.
      
      Instead use the implicit locking created by pci_enable/disable_sriov(). A
      VF driver can access its PF drvdata only while its own driver is attached,
      and the PF driver can control access to its own drvdata based on when it
      calls pci_enable/disable_sriov().
      
      To use this API the PF driver will setup the PF drvdata in the probe()
      function. pci_enable_sriov() is only called from sriov_configure() which
      cannot happen until probe() completes, ensuring no VF races with drvdata
      setup.
      
      For removal, the PF driver must call pci_disable_sriov() in its remove
      function before destroying any of the drvdata. This ensures that all VF
      drivers are unbound before returning, fencing concurrent access to the
      drvdata.
      
      Introducing a new function for this access makes the special locking
      scheme clear and documents the requirements on the PF/VF drivers using
      it.
      
      Link: https://lore.kernel.org/all/20220224142024.147653-5-yishaih@nvidia.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
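      
      A hedged sketch of a VF driver using the accessor described above; the
      foo_* structures and the PF driver object are hypothetical:
      
        #include <linux/err.h>
        #include <linux/pci.h>
        
        struct foo_pf_priv;                     /* hypothetical PF drvdata type */
        extern struct pci_driver foo_pf_driver; /* hypothetical PF driver */
        
        static struct foo_pf_priv *foo_vf_get_pf_priv(struct pci_dev *vf_dev)
        {
                struct foo_pf_priv *pf_priv;
        
                /* Only valid while the VF driver is bound: the PF driver keeps
                 * its drvdata alive across pci_enable/disable_sriov(). */
                pf_priv = pci_iov_get_pf_drvdata(vf_dev, &foo_pf_driver);
                if (IS_ERR(pf_priv))
                        return NULL;
        
                return pf_priv;
        }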
    • PCI/IOV: Add pci_iov_vf_id() to get VF index · 21ca9fb6
      Jason Gunthorpe committed
      The PCI core uses the VF index internally, often called the vf_id,
      during the setup of the VF, e.g. in pci_iov_add_virtfn().
      
      This index is needed for device drivers that implement live migration
      for their internal operations that configure/control their VFs.
      
      Specifically, the mlx5_vfio_pci driver introduced in later patches of
      this series needs it, rather than the bus/device/function which is
      exposed today.
      
      Add pci_iov_vf_id() which computes the vf_id by reversing the math that
      was used to create the bus/device/function.
      
      Link: https://lore.kernel.org/all/20220224142024.147653-2-yishaih@nvidia.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
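      
      A minimal usage sketch, assuming pci_iov_vf_id() returns the VF index or
      a negative errno; the probe function is an illustrative placeholder:
      
        #include <linux/pci.h>
        
        static int example_vf_probe(struct pci_dev *pdev,
                                    const struct pci_device_id *id)
        {
                int vf_id = pci_iov_vf_id(pdev);
        
                if (vf_id < 0)
                        return vf_id;   /* not a VF, or the index is unavailable */
        
                dev_info(&pdev->dev, "probing as VF %d\n", vf_id);
                return 0;
        }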
  15. 11 January 2022, 1 commit
  16. 18 December 2021, 1 commit
  17. 17 December 2021, 2 commits
  18. 09 December 2021, 2 commits
  19. 19 November 2021, 1 commit
  20. 12 November 2021, 1 commit
  21. 11 November 2021, 1 commit
  22. 08 November 2021, 1 commit
  23. 30 October 2021, 1 commit
  24. 18 October 2021, 2 commits
  25. 13 October 2021, 1 commit
    • PCI: Return NULL for to_pci_driver(NULL) · 8e9028b3
      Bjorn Helgaas committed
      to_pci_driver() takes a pointer to a struct device_driver and uses
      container_of() to find the struct pci_driver that contains it.
      
      If given a NULL pointer to a struct device_driver, return a NULL pci_driver
      pointer instead of applying container_of() to NULL.
      
      This simplifies callers that would otherwise have to check for a NULL
      pointer first.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
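      
      A sketch of the caller-side simplification this enables; the function is
      an illustrative placeholder:
      
        #include <linux/pci.h>
        
        static struct pci_driver *example_get_pci_driver(struct device *dev)
        {
                /* Before this change a caller had to write:
                 *     dev->driver ? to_pci_driver(dev->driver) : NULL
                 * With a NULL-safe to_pci_driver() the check disappears. */
                return to_pci_driver(dev->driver);
        }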
  26. 12 October 2021, 1 commit
  27. 22 September 2021, 1 commit
  28. 01 September 2021, 4 commits
  29. 27 August 2021, 2 commits
    • PCI: Allow PASID on fake PCIe devices without TLP prefixes · 8c09e896
      Zhangfei Gao committed
      Some systems, e.g., HiSilicon KunPeng920 and KunPeng930, have devices that
      appear as PCI but are actually on the AMBA bus.  Some of these fake PCI
      devices support a PASID-like feature and they do have a working PASID
      capability even though they do not use PCIe Transaction Layer Packets
      (TLPs) and do not support TLP prefixes.
      
      Add a pasid_no_tlp bit for this "PASID works without TLP prefixes" case and
      update pci_enable_pasid() so it can enable PASID on these devices.
      
      Set this bit for HiSilicon KunPeng920 and KunPeng930.
      
      [bhelgaas: squashed, commit log]
      Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
      Link: https://lore.kernel.org/r/1626144876-11352-2-git-send-email-zhangfei.gao@linaro.org
      Link: https://lore.kernel.org/r/1626144876-11352-3-git-send-email-zhangfei.gao@linaro.org
      Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
      Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
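      
      A hedged sketch of how such a device could be marked, assuming the
      pasid_no_tlp bit described above; the vendor/device IDs and the fixup
      stage are illustrative, not the actual KunPeng quirk:
      
        #include <linux/pci.h>
        
        /* Mark a fake-PCI device as providing a working PASID capability even
         * though it does not support TLP prefixes. */
        static void example_mark_pasid_no_tlp(struct pci_dev *pdev)
        {
                pdev->pasid_no_tlp = 1; /* pci_enable_pasid() then accepts it */
        }
        DECLARE_PCI_FIXUP_FINAL(0x1234, 0x5678, example_mark_pasid_no_tlp);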
    • PCI / VFIO: Add 'override_only' support for VFIO PCI sub system · cc6711b0
      Max Gurtovoy committed
      Expose an 'override_only' helper macro (i.e.
      PCI_DRIVER_OVERRIDE_DEVICE_VFIO) for the VFIO PCI subsystem and add the
      required code to prefix its matching entries with "vfio_" in the
      modules.alias file.
      
      It allows VFIO device drivers to include match entries in the
      modules.alias file produced by kbuild that are not used for normal
      driver autoprobing and module autoloading. Drivers using these match
      entries can be connected to the PCI device manually, by userspace, using
      the existing driver_override sysfs.
      
      For example the resulting modules.alias may have:
      
        alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core
        alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
        alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
      
      In this example mlx5_core and mlx5_vfio_pci match to the same PCI
      device. The kernel will autoload and autobind to mlx5_core but the
      kernel and udev mechanisms will ignore mlx5_vfio_pci.
      
      When userspace wants to change a device to the VFIO subsystem it can
      implement a generic algorithm:
      
         1) Identify the sysfs path to the device:
          /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
      
         2) Get the modalias string from the kernel:
          $ cat /sys/bus/pci/devices/0000:01:00.0/modalias
          pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
      
         3) Prefix it with vfio_:
          vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00
      
         4) Search modules.alias for the above string and select the entry that
            has the fewest *'s:
          alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci
      
         5) modprobe the matched module name:
          $ modprobe mlx5_vfio_pci
      
         6) cat the matched module name to driver_override:
          echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
      
         7) unbind device from original module
           echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
      
         8) probe PCI drivers (or explicitly bind to mlx5_vfio_pci)
          echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
      
      The algorithm is independent of bus type. In future the other buses with
      VFIO device drivers, like platform and ACPI, can use this algorithm as
      well.
      
      This patch provides the infrastructure to expose this information in
      modules.alias to userspace. Convert the only existing VFIO pci_driver,
      which results in one new line in modules.alias:
      
        alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci
      
      Later series introduce additional HW specific VFIO PCI drivers, such as
      mlx5_vfio_pci.
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # for pci.h
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Link: https://lore.kernel.org/r/20210826103912.128972-11-yishaih@nvidia.com
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
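      
      A minimal sketch of a variant driver's ID table using the helper macro
      described above; the module name and exact alias output depend on the
      driver, and the IDs are taken from the example in this entry:
      
        #include <linux/module.h>
        #include <linux/pci.h>
        
        /* These entries only match via driver_override; kbuild emits them into
         * modules.alias with the "vfio_pci:" prefix instead of "pci:", so udev
         * never autoloads or autobinds the module. */
        static const struct pci_device_id example_vfio_ids[] = {
                { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(0x15b3, 0x1021) },
                { }
        };
        MODULE_DEVICE_TABLE(pci, example_vfio_ids);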