1. 14 Mar 2017, 1 commit
  2. 01 Mar 2017, 1 commit
  3. 23 Nov 2016, 1 commit
    • spapr: Fix 2.7<->2.8 migration of PCI host bridge · 5c4537bd
      Committed by David Gibson
      daa23699 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
      from qemu-2.7 to the current version.  It split the device's MMIO
      window into two pieces for 32-bit and 64-bit MMIO.
      
      The patch included backwards compatibility code to convert the old
      property into the new format.  However, the property value was also
      transferred in the migration stream and compared with a (probably
      unwise) VMSTATE_EQUAL.  So, the "raw" value from 2.7 is compared to
      the new style converted value from (pre-)2.8 giving a mismatch and
      migration failure.
      
      Along with the actual field that caused the breakage, there are
      several other ill-advised VMSTATE_EQUAL()s.  To fix forwards
      migration, we read the values in the stream into scratch variables and
      ignore them, instead of comparing for equality.  To fix backwards
      migration, we populate those scratch variables in pre_save() with
      adjusted values to match the old behaviour.
      
      To permit the eventual possibility of removing this cruft from the
      stream, we only include these compatibility fields if a new
      'pre-2.8-migration' property is set.  We clear it on the pseries-2.8
      machine type, which obviously can't be migrated backwards, but set it
      on earlier machine type versions.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
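The scratch-variable trick described above can be sketched in C as follows. This is a minimal illustration only; the struct and field names are assumptions, not the actual QEMU code, which uses the VMState machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch (not the actual QEMU fields) of the pre_save()
 * compatibility conversion: the pre-2.8 stream carried a single MMIO
 * window, so for backwards migration we populate a scratch field with
 * the old-style combined value instead of comparing with VMSTATE_EQUAL. */
typedef struct {
    uint64_t mem_win_size;     /* 32-bit MMIO window size */
    uint64_t mem64_win_size;   /* 64-bit MMIO window size */
    int pre_2_8_migration;     /* property set on pre-2.8 machine types */
    uint64_t mig_mem_win_size; /* scratch value written to the stream */
} SketchPhb;

static void sketch_pre_save(SketchPhb *phb)
{
    if (phb->pre_2_8_migration) {
        /* In the legacy layout the 64-bit window immediately followed
         * the 32-bit one, so the old single-window size is the sum. */
        phb->mig_mem_win_size = phb->mem_win_size + phb->mem64_win_size;
    }
}
```

On the inbound side the corresponding stream fields are read into scratch variables and simply ignored, which is what makes forward migration work.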
  4. 16 Oct 2016, 3 commits
    • spapr: Improved placement of PCI host bridges in guest memory map · 357d1e3b
      Committed by David Gibson
      Currently, the MMIO space for accessing PCI on pseries guests begins at
      1 TiB in guest address space.  Each PCI host bridge (PHB) has a 64 GiB
      chunk of address space in which it places its outbound PIO and 32-bit and
      64-bit MMIO windows.
      
      This scheme has several problems:
        - It limits guest RAM to 1 TiB (though we have a limited fix for this
          now)
        - It limits the total MMIO window to 64 GiB.  This is not always enough
          for some of the large nVidia GPGPU cards
        - Putting all the windows into a single 64 GiB area means that naturally
          aligning things within there will waste more address space.
      In addition there was a miscalculation in some of the defaults, which meant
      that the MMIO windows for each PHB actually slightly overran the 64 GiB
      region for that PHB.  We got away without nasty consequences because
      the overrun fit within an unused area at the beginning of the next PHB's
      region, but it's not pretty.
      
      This patch implements a new scheme which addresses those problems, and is
      also closer to what bare metal hardware and pHyp guests generally use.
      
      Because some guest versions (including most current distro kernels) can't
      access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
      64 TiB.  This is broken into 1 TiB chunks.  The first 1 TiB contains the
      PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs.  Each
      subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
      one PHB each.
      
      This reduces the number of allowed PHBs (without full manual configuration
      of all the windows) from 256 to 31, but this should still be plenty in
      practice.
      
      We also change some of the default window sizes for manually configured
      PHBs to saner values.
      
      Finally we adjust some tests and libqos so that it correctly uses the new
      default locations.  Ideally it would parse the device tree given to the
      guest, but that's a more complex problem for another time.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Laurent Vivier <lvivier@redhat.com>
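The placement rule described above reduces to simple address arithmetic. Below is a hedged sketch; the constants match the numbers quoted in the message but are written out here as assumptions, not copied from the QEMU source.

```c
#include <assert.h>
#include <stdint.h>

/* A minimal sketch of the placement rule: PCI windows live between
 * 32 TiB and 64 TiB, the first 1 TiB chunk holds the PIO and 32-bit
 * MMIO windows for all PHBs, and each following 1 TiB chunk holds the
 * naturally aligned 64-bit MMIO window of one PHB. */
#define TIB          (1ULL << 40)
#define PCI_HOLE_LO  (32 * TIB)
#define PCI_HOLE_HI  (64 * TIB)
#define MAX_PHBS     31   /* 32 chunks minus the shared first chunk */

static uint64_t phb_mmio64_base(unsigned index)
{
    /* chunk 0 is shared; PHB "index" gets chunk index + 1 */
    return PCI_HOLE_LO + (uint64_t)(index + 1) * TIB;
}
```

Because each window starts on a 1 TiB boundary and is at most 1 TiB in size, it is naturally aligned by construction, which is what eliminates the overrun problem described above.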
    • spapr_pci: Add a 64-bit MMIO window · daa23699
      Committed by David Gibson
      On real hardware, and under pHyp, the PCI host bridges on Power machines
      typically advertise two outbound MMIO windows from the guest's physical
      memory space to PCI memory space:
        - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
        - A 64-bit window which maps onto a large region somewhere high in PCI
          address space (traditionally this used an identity mapping from guest
          physical address to PCI address, but that's not always the case)
      
      The qemu implementation in spapr-pci-host-bridge, however, only supports a
      single outbound MMIO window.  Since at least some Linux versions expect
      the two windows, we arranged this window to map onto the PCI
      memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
      windows, the "32-bit" window from 2G..4G and the "64-bit" window from
      4G..~64G.
      
      This approach means, however, that the 64G window is not naturally aligned.
      In turn this limits the size of the largest BAR we can map (which does have
      to be naturally aligned) to roughly half of the total window.  With some
      large nVidia GPGPU cards which have huge memory BARs, this is starting to
      be a problem.
      
      This patch adds true support for separate 32-bit and 64-bit outbound MMIO
      windows to the spapr-pci-host-bridge implementation, each of which can
      be independently configured.  The 32-bit window always maps to 2G.. in PCI
      space, but the PCI address of the 64-bit window can be configured (it
      defaults to the same as the guest physical address).
      
      So as not to break possible existing configurations, a large single
      window can still be used as long as no 64-bit window is specified.  This
      will appear the same way to the guest as the old approach, although it's
      now implemented by two contiguous memory regions rather than a single one.
      
      For now, this only adds the possibility of 64-bit windows.  The default
      configuration still uses the legacy mode.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Laurent Vivier <lvivier@redhat.com>
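Why a misaligned window limits BAR size to "roughly half" can be shown with a small helper. This is an illustrative calculation, not code from the patch: it finds the largest naturally aligned power-of-two region that fits inside a window.

```c
#include <assert.h>
#include <stdint.h>

/* Largest naturally aligned power-of-two region fitting in [lo, hi).
 * A window starting at 2 GiB admits at most ~half its span as one BAR;
 * a window aligned to its own size admits the full span. */
static uint64_t largest_aligned_bar(uint64_t lo, uint64_t hi)
{
    uint64_t best = 0;
    for (uint64_t size = 1; size <= hi; size <<= 1) {
        uint64_t start = (lo + size - 1) & ~(size - 1); /* align up */
        if (start + size <= hi) {
            best = size;
        }
    }
    return best;
}
```

For the legacy 2G..64G layout the answer is 32 GiB, whereas a naturally aligned 64 GiB window can host a 64 GiB BAR; that difference is the motivation for the separate, alignable 64-bit window.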
    • spapr_pci: Delegate placement of PCI host bridges to machine type · 6737d9ad
      Committed by David Gibson
      The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
      for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
      and PAPR guests) to have numerous independent PHBs, each controlling a
      separate PCI domain.
      
      There are two ways of configuring the spapr-pci-host-bridge device: first
      it can be done fully manually, specifying the locations and sizes of all
      the IO windows.  This gives the most control, but is very awkward with 6
      mandatory parameters.  Alternatively just an "index" can be specified
      which essentially selects from an array of predefined PHB locations.
      The PHB at index 0 is automatically created as the default PHB.
      
      The current set of default locations causes some problems for guests with
      large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
      GPGPU cards via VFIO).  Obviously, for migration we can only change the
      locations on a new machine type, however.
      
      This is awkward, because the placement is currently decided within the
      spapr-pci-host-bridge code, so it breaks abstraction to look inside the
      machine type version.
      
      So, this patch delegates the "default mode" PHB placement from the
      spapr-pci-host-bridge device back to the machine type via a public method
      in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
      can do.
      
      For now, this just changes where the calculation is done.  It doesn't
      change the actual location of the host bridges, or any other behaviour.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Laurent Vivier <lvivier@redhat.com>
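The delegation pattern described above, a placement method on the machine class rather than in the device, can be sketched like this. All names and constants here are hypothetical stand-ins for the sPAPRMachineClass method, chosen only to show the shape of the indirection.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Sketch: the machine class exposes a placement method, and the
 * host-bridge realize code calls it instead of computing window
 * addresses itself. */
typedef struct SketchMachineClass {
    void (*phb_placement)(unsigned index, uint64_t *buid, uint64_t *mmio);
} SketchMachineClass;

static void pseries_phb_placement(unsigned index, uint64_t *buid,
                                  uint64_t *mmio)
{
    *buid = 0x800000020000000ULL + index;               /* assumed scheme */
    *mmio = 0x10000000000ULL + index * 0x1000000000ULL; /* 1 TiB + n*64 GiB */
}

static void phb_realize(const SketchMachineClass *smc, unsigned index,
                        uint64_t *buid, uint64_t *mmio)
{
    /* the device no longer decides placement; the machine type does */
    smc->phb_placement(index, buid, mmio);
}
```

A newer machine type can then install a different phb_placement without the device code knowing which machine version it runs under.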
  5. 23 Sep 2016, 1 commit
  6. 15 Sep 2016, 1 commit
  7. 12 Jul 2016, 2 commits
  8. 05 Jul 2016, 1 commit
    • spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) · ae4de14c
      Committed by Alexey Kardashevskiy
      This adds support for the Dynamic DMA Windows (DDW) option defined by
      the SPAPR specification, which allows having additional DMA window(s).
      
      The "ddw" property is enabled by default on a PHB but for compatibility
      the pseries-2.6 machine and older disable it.
      This also creates a single DMA window for the older machines to
      maintain backward migration.
      
      This implements DDW for PHB with emulated and VFIO devices. The host
      kernel support is required. The advertised IOMMU page sizes are 4K and
      64K; 16M pages are supported but not advertised by default.  To
      enable them, the user has to specify the "pgsz" property for the PHB and
      enable huge pages for RAM.
      
      Existing Linux guests try creating one additional huge DMA window
      with 64K or 16MB pages and mapping the entire guest RAM into it. If that
      succeeds, the guest switches to dma_direct_ops and never calls TCE hypercalls
      (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
      and not waste time on map/unmap later. This adds a "dma64_win_addr"
      property which is a bus address for the 64bit window and by default
      set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware
      uses and this allows having emulated and VFIO devices on the same bus.
      
      This adds 4 RTAS handlers:
      * ibm,query-pe-dma-window
      * ibm,create-pe-dma-window
      * ibm,remove-pe-dma-window
      * ibm,reset-pe-dma-window
      These are registered from type_init() callback.
      
      These RTAS handlers are implemented in a separate file to avoid polluting
      spapr_iommu.c with PCI.
      
      This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs
      and updates all references to dma_liobn. However, this does not add the
      64-bit LIOBN to the migration stream: even the 32-bit LIOBN is
      rather pointless there (it is a PHB property, and the management
      software can/should pass LIOBNs via the CLI), but we keep it for
      backward migration support.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
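Registering the four RTAS handlers by name can be sketched with a simple dispatch table. The table layout and handler signatures below are illustrative, not QEMU's RTAS API; only the four call names come from the message above.

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* Sketch of name-based RTAS dispatch for the DDW calls. */
typedef int (*rtas_handler)(void);

static int query_window(void)  { return 0; }
static int create_window(void) { return 0; }
static int remove_window(void) { return 0; }
static int reset_window(void)  { return 0; }

static const struct {
    const char *name;
    rtas_handler fn;
} rtas_table[] = {
    { "ibm,query-pe-dma-window",  query_window },
    { "ibm,create-pe-dma-window", create_window },
    { "ibm,remove-pe-dma-window", remove_window },
    { "ibm,reset-pe-dma-window",  reset_window },
};

static rtas_handler rtas_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(rtas_table) / sizeof(rtas_table[0]); i++) {
        if (strcmp(rtas_table[i].name, name) == 0) {
            return rtas_table[i].fn;
        }
    }
    return NULL;
}
```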
  9. 01 Jul 2016, 1 commit
  10. 07 Jun 2016, 1 commit
  11. 16 Mar 2016, 4 commits
    • spapr_pci: Remove finish_realize hook · a36304fd
      Committed by David Gibson
      Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
      only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
      we can fold that implementation into its (single) caller, and remove the
      hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
      well.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    • spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge · 72700d7e
      Committed by David Gibson
      Now that the regular spapr-pci-host-bridge can handle EEH, there are only
      two things that spapr-pci-vfio-host-bridge does differently:
          1. automatically sizes its DMA window to match the host IOMMU
          2. checks if the attached VFIO container is backed by the
             VFIO_SPAPR_TCE_IOMMU type on the host
      
      (1) is not particularly useful, since the default window used by the
      regular host bridge will work with the host IOMMU configuration on all
      current systems anyway.
      
      Plus, automatically changing guest visible configuration (such as the DMA
      window) based on host settings is generally a bad idea.  It's not
      definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
      support VFIO devices which can't be migrated anyway, but still.
      
      (2) is not really useful, because if a guest tries to configure EEH on a
      different host IOMMU, the first call will fail and that will be that.
      
      It's possible there are scripts or tools out there which expect
      spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
      reduces it to just a stub for backwards compatibility.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    • spapr_pci: Allow EEH on spapr-pci-host-bridge · c1fa017c
      Committed by David Gibson
      Now that the EEH code is independent of the special
      spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
      host bridges instead.  We do this by changing spapr_phb_eeh_available()
      to be based on the vfio_eeh_as_ok() call instead of the host bridge class.
      
      Because the value of vfio_eeh_as_ok() can change with devices being
      hotplugged or unplugged, this can potentially lead to some strange edge
      cases where the guest starts using EEH, then it starts failing because
      of a change in status.
      
      However, it's not really any worse than the current situation.  Cases that
      would have worked previously will still work (i.e. VFIO devices from at
      most one VFIO IOMMU group per vPHB), it's just that it's no longer
      necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    • spapr_pci: Eliminate class callbacks · fbb4e983
      Committed by David Gibson
      The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
      special groupid field in sPAPRPHBVFIOState.  So we can simplify, replacing
      the class-specific callbacks with direct calls based on a simple
      spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
      a boolean in the class, but we'll continue to clean that up later.
      
      On its own this is a rather strange way of doing things, but it's a useful
      intermediate step to further cleanups.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
  12. 23 Oct 2015, 1 commit
  13. 07 Jul 2015, 1 commit
    • spapr: Merge sPAPREnvironment into sPAPRMachineState · 28e02042
      Committed by David Gibson
      The code for -machine pseries maintains a global sPAPREnvironment structure
      which keeps track of general state information about the guest platform.
      This predates the existence of the MachineState structure, but performs
      basically the same function.
      
      Now that we have the generic MachineState, fold sPAPREnvironment into
      sPAPRMachineState, the pseries specific subclass of MachineState.
      
      This is mostly a matter of search and replace, although a few places which
      relied on the global spapr variable are changed to find the structure via
      qdev_get_machine().
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  14. 04 Jun 2015, 3 commits
  15. 09 Mar 2015, 3 commits
    • sPAPR: Implement EEH RTAS calls · ee954280
      Committed by Gavin Shan
      QEMU does not yet cover the emulation of EEH RTAS requests from
      the guest; this patch implements them.
      
      The patch defines constants used by EEH RTAS calls and adds
      callbacks sPAPRPHBClass::{eeh_set_option, eeh_get_state, eeh_reset,
      eeh_configure}, which are going to be used as follows:
      
        * RTAS calls are received in spapr_pci.c, sanity check is done
          there.
        * RTAS handlers handle what they can. If there is something it
          cannot handle and the corresponding sPAPRPHBClass callback is
          defined, it is called.
        * Those callbacks are only implemented for VFIO now. They do ioctl()
          to the IOMMU container fd to complete the calls. Error codes from
          that ioctl() are transferred back to the guest.
      
      [aik: defined RTAS tokens for EEH RTAS calls]
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
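The three-step flow in the list above (sanity check in spapr_pci.c, then delegate to a class callback if one is defined) can be sketched as follows. The error codes and struct layout are assumptions for illustration; only the callback-dispatch pattern is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the EEH RTAS dispatch pattern: sanity-check the request,
 * then delegate to the class callback (only VFIO implements them). */
enum { RTAS_OUT_SUCCESS = 0, RTAS_OUT_NOT_SUPPORTED = -1,
       RTAS_OUT_PARAM_ERROR = -3 };

typedef struct SketchPhbClass {
    int (*eeh_set_option)(int option);
} SketchPhbClass;

static int vfio_eeh_set_option(int option)
{
    (void)option;
    return RTAS_OUT_SUCCESS;   /* stands in for the ioctl() to the container fd */
}

static int rtas_eeh_set_option(const SketchPhbClass *spc, int option)
{
    if (option < 0 || option > 2) {   /* sanity check done in spapr_pci.c */
        return RTAS_OUT_PARAM_ERROR;
    }
    if (!spc->eeh_set_option) {       /* no VFIO backing: unsupported */
        return RTAS_OUT_NOT_SUPPORTED;
    }
    return spc->eeh_set_option(option);
}
```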
    • spapr-pci: Enable huge BARs · b194df47
      Committed by Alexey Kardashevskiy
      At the moment sPAPR only supports 512MB window for MMIO BARs. However
      modern devices might want bigger 64bit BARs.
      
      This extends the MMIO window from 512MB to 62GB (aligned to
      SPAPR_PCI_WINDOW_SPACING) and advertises it in 2 records in
      the PHB "ranges" property. The 32-bit record gets the space from
      SPAPR_PCI_MEM_WIN_BUS_OFFSET to the end of 4GB, and the 64-bit record
      gets the rest. If no space is left, the 64-bit range is not advertised.
      
      The MMIO space size defaults to the old value of 0x20000000
      for pseries machines older than 2.3.
      
      The approach changes the device tree, which is a guest visible change;
      however it won't break migration as:
      1. we do not support migration to older QEMU versions
      2. migration to newer QEMU will migrate the device tree as well, and since
      the new layout only extends the old one and does not change address mappings,
      no breakage is expected there either.
      
      SLOF change is required to utilize this extension.
      Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
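The split of one window into the two "ranges" records can be sketched with a few lines of address arithmetic. The 2 GiB offset constant is an assumption standing in for SPAPR_PCI_MEM_WIN_BUS_OFFSET; it is not copied from the QEMU source.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: split the MMIO window [offset, offset + win_size) into a
 * 32-bit record below 4 GiB and a 64-bit record above it. */
#define MEM_WIN_BUS_OFFSET 0x80000000ULL   /* assumed: 2 GiB */
#define FOUR_GIB           0x100000000ULL

typedef struct { uint64_t base, size; } Range;

static void split_mmio_ranges(uint64_t win_size, Range *r32, Range *r64)
{
    uint64_t end = MEM_WIN_BUS_OFFSET + win_size;

    r32->base = MEM_WIN_BUS_OFFSET;
    r32->size = (end < FOUR_GIB ? end : FOUR_GIB) - MEM_WIN_BUS_OFFSET;
    r64->base = FOUR_GIB;
    /* if nothing is left above 4 GiB, the 64-bit record is omitted */
    r64->size = end > FOUR_GIB ? end - FOUR_GIB : 0;
}
```

With the old 512MB window the 64-bit record comes out empty, which matches the "not advertised" case above.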
    • pseries: Limit PCI host bridge "index" value · 3e4ac968
      Committed by David Gibson
      pseries guests can have large numbers of PCI host bridges.  To avoid the
      user having to specify a number of different configuration values for every
      one, the device supports an "index" property which is a shorthand setting
      the various window and configuration addresses from a predefined sensible
      set.
      
      There are some problems with the details at present:
  * The "index" property is signed, but negative values will create PCI
      windows below where we expect, potentially colliding with other devices
        * No limit is imposed on the "index" property and large values can
      translate to extremely large window addresses.  With PCI passthrough in
particular this can mean we exceed various mapping and physical address
limits, causing the guest host bridge to misbehave in strange ways.
      
      This patch addresses this, by making "index" unsigned, and imposing a
      limit.  Currently the limit allows indices from 0..255 which is probably
      enough host bridges for the time being.  It's fairly easy to extend if
      we discover we need more.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
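The fix above amounts to an unsigned property plus a range check before deriving window addresses. A minimal sketch, with illustrative base and spacing constants (the 0..255 limit is the one stated in the message):

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

#define PHB_INDEX_MAX 255u
#define PHB_BASE      0x10000000000ULL  /* assumed: 1 TiB */
#define PHB_SPACING   0x1000000000ULL   /* assumed: 64 GiB per PHB */

/* "index" is now unsigned, so negative values cannot slip in, and the
 * cap keeps derived addresses within sane physical limits. */
static bool phb_index_valid(uint32_t index)
{
    return index <= PHB_INDEX_MAX;
}

static uint64_t phb_window_base(uint32_t index)
{
    return PHB_BASE + (uint64_t)index * PHB_SPACING;
}
```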
  16. 08 Sep 2014, 1 commit
    • spapr_pci: map the MSI window in each PHB · 8c46f7ec
      Committed by Greg Kurz
      On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
      Commit cc943c36 has modified MSI-X
      so that writes are made using the bus master address space and follow
      the IOMMU path.
      
      Unfortunately, the IOMMU address space does not have an
      MSI window: the notification is silently dropped in unassigned_mem_write
      instead of reaching the guest... The most visible effect is that all
      virtio devices are non-functional on sPAPR since then. :(
      
      This patch does the following:
      1) map the MSI window into the IOMMU address space for each PHB
         - since each PHB instantiates its own IOMMU address space, we
           can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
         - no real need to keep the MSI window setup in a separate function,
           the spapr_pci_msi_init() code moves to spapr_phb_realize().
      
      2) kill the global MSI window as it is not needed in the end
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  17. 27 Jun 2014, 2 commits
    • spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB · 9a321e92
      Committed by Alexey Kardashevskiy
      Currently SPAPR PHB keeps track of all allocated MSI interrupts (here
      and below, MSI stands for both MSI and MSI-X) because
      XICS used to be unable to reuse interrupts. This is a problem for
      dynamic MSI reconfiguration which happens when guest reloads a driver
      or performs PCI hotplug. Another problem is that the existing
      implementation can enable MSI on 32 devices maximum
      (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
      
      This makes use of new XICS ability to reuse interrupts.
      
      This reorganizes MSI information storage in sPAPRPHBState. Instead of
      static array of 32 descriptors (one per PCI function), this patch adds
      a GHashTable where @config_addr is the key and (first_irq, num) pair is
      a value. GHashTable can dynamically grow and shrink so the initial limit
      of 32 devices is gone.
      
      This changes the migration stream, as @msi_table was a static array while
      the new @msi_devs is a dynamic hash table. This adds a temporary array which
      is used for migration; it is populated in the "spapr_pci"::pre_save() callback
      and expanded into the hash table in post_load() callback. Since
      the destination side does not know the number of MSI-enabled devices
      in advance and cannot pre-allocate the temporary array to receive
      migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
      which allocates the array automatically.
      
      This resets the MSI configuration space when interrupts are released by
      the ibm,change-msi RTAS call.
      
      This fixes traces to be more informative.
      
      This changes the vmstate_spapr_pci_msi name from "...lsi" to "...msi",
      which was incorrect by accident. As the internal representation changed,
      this also bumps the migration version number.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [agraf: drop g_malloc_n usage]
      Signed-off-by: Alexander Graf <agraf@suse.de>
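The flatten-on-save / rebuild-on-load pattern described above can be sketched as follows. QEMU keeps the live data in a GHashTable keyed by @config_addr; here a tiny fixed array stands in for it so the two migration steps stay visible without a GLib dependency. All names are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t config_addr, first_irq, num; } MsiEntry;

typedef struct {
    MsiEntry live[8];   /* stand-in for the GHashTable */
    int      live_count;
    MsiEntry mig[8];    /* temporary array written to the stream */
    int      mig_count;
} SketchPhbMsi;

/* pre_save(): flatten the live structure into the migration array */
static void msi_pre_save(SketchPhbMsi *s)
{
    memcpy(s->mig, s->live, sizeof(s->live));
    s->mig_count = s->live_count;
}

/* post_load(): expand the received array back into the live structure;
 * in QEMU the array itself is allocated on demand on the destination
 * (the VMSTATE_STRUCT_VARRAY_ALLOC case mentioned above) */
static void msi_post_load(SketchPhbMsi *s)
{
    memcpy(s->live, s->mig, sizeof(s->mig));
    s->live_count = s->mig_count;
}
```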
    • spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio · 9fc34ada
      Committed by Alexey Kardashevskiy
      The patch adds a spapr-pci-vfio-host-bridge device type
      which is a PCI Host Bridge with VFIO support. The new device
      inherits from the spapr-pci-host-bridge device and adds an "iommu"
      property which is an IOMMU id. This ID represents a minimal entity
      for which IOMMU isolation can be guaranteed. In the SPAPR architecture an
      IOMMU group is called a Partitionable Endpoint (PE).
      
      Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
      SPAPR allows multiple PHBs at no extra cost, this does not seem to
      be a problem. This limitation may change in the future though.
      
      Example of use:
      Configure and add 3 functions of a multifunction device to QEMU
      (the NEC PCI USB card is used as an example here):
      -device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
      -device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
      -device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
      -device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB
      
      where:
      * index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
      offset);
      * iommu=4 is an IOMMU id which can be found in sysfs:
      [aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
      [aik@vpl2 0004:00:00.0]$ ls -l iommu_group
      lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  18. 16 Jun 2014, 4 commits
    • spapr_iommu: Get rid of window_size in sPAPRTCETable · 523e7b8a
      Committed by Alexey Kardashevskiy
      This removes window_size as it is basically a copy of nb_table
      shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are
      going to support windows as big as the entire RAM and this number
      will be bigger that 32 capacity, we will have to do something
      about @window_size anyway and removal seems to be the right way to go.
      
      This removes dma_window_start/dma_window_size from sPAPRPHBState as
      they are no longer used.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr_pci: Allow multiple TCE tables per PHB · e28c16f6
      Committed by Alexey Kardashevskiy
      At the moment sPAPRPHBState contains a @tcet pointer to the only
      TCE table. However, the sPAPR spec allows having more than one DMA window.
      
      Since the TCE object is already a child of SPAPR PHB object, there is
      no need to keep an additional pointer to it in sPAPRPHBState so remove it.
      
      This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable
      objects.
      
      This changes the default DMA window properties calculation.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr_pci: spapr_iommu: Make DMA window a subregion · cca7fad5
      Committed by Alexey Kardashevskiy
      Currently the default DMA window is represented by a single MemoryRegion.
      However there can be more than just one window so we need
      a "root" memory region to be separated from the actual DMA window(s).
      
      This introduces a "root" IOMMU memory region and adds a subregion for
      the default DMA 32bit window. Following patches will add other
      subregion(s).
      
      This initializes a default DMA window subregion size to the guest RAM
      size as this window can be switched into "bypass" mode which implements
      direct DMA mapping.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr_pci: Introduce a finish_realize() callback · da6ccee4
      Committed by Alexey Kardashevskiy
      The spapr-pci PHB initializes IOMMU for emulated devices only.
      The upcoming VFIO support will do it differently. However both emulated
      and VFIO PHB types share most of the initialization code.
      For the type specific things a new finish_realize() callback is
      introduced.
      
      This introduces sPAPRPHBClass derived from PCIHostBridgeClass and
      adds the callback pointer.
      
      This implements finish_realize() for emulated devices.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [agraf: Fix compilation]
      Signed-off-by: Alexander Graf <agraf@suse.de>
  19. 02 Sep 2013, 1 commit
    • spapr-pci: rework MSI/MSIX · f1c2dc7c
      Committed by Alexey Kardashevskiy
      On the sPAPR platform a guest allocates MSI/MSIX vectors via RTAS
      hypercalls which return global IRQ numbers to a guest so it only
      operates with those and never touches MSIMessage.
      
      Therefore MSIMessage handling is completely hidden in QEMU.
      
      Previously every sPAPR PCI host bridge implemented its own MSI window
      to catch msi_notify()/msix_notify() calls from QEMU devices (virtio-pci
      or vfio) and route them to the guest via qemu_pulse_irq().
      MSIMessage used to be encoded as:
      	.addr - address within the PHB MSI window;
      	.data - the device index on PHB plus vector number.
      The MSI MR write function translated this MSIMessage to a global IRQ
      number and called qemu_pulse_irq().
      
      However the total number of IRQs is not really big (at the moment it is
      1024 IRQs starting from 4096), and even the 16-bit data field of MSIMessage
      seems to be enough to store an IRQ number there.
      
      This simplifies MSI handling in sPAPR PHB. Specifically, this does:
      1. remove a MSI window from a PHB;
      2. add a single memory region for all MSIs to sPAPREnvironment
      and spapr_pci_msi_init() to initialize it;
      3. encode MSIMessage as:
          * .addr - a fixed address of SPAPR_PCI_MSI_WINDOW==0x40000000000ULL;
          * .data as an IRQ number.
       4. change the IRQ allocator to align the first IRQ number in a block for MSI.
      MSI uses lower bits to specify the vector number so the first IRQ has to
      be aligned. MSIX does not need any special allocator though.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
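The encoding in step 3 and the alignment rule in step 4 can be sketched directly. The window address is the one quoted in the message; the struct and the allocator helper are simplified stand-ins.

```c
#include <assert.h>
#include <stdint.h>

#define SPAPR_PCI_MSI_WINDOW 0x40000000000ULL

typedef struct { uint64_t addr; uint32_t data; } SketchMSIMessage;

/* step 3: .addr is fixed, .data carries the global IRQ number directly */
static SketchMSIMessage spapr_msi_encode(uint32_t irq)
{
    return (SketchMSIMessage){ .addr = SPAPR_PCI_MSI_WINDOW, .data = irq };
}

/* step 4: the first IRQ of an MSI block must be aligned to the
 * (power-of-two) block size, since MSI uses the low bits of .data as
 * the vector number; MSI-X needs no such alignment */
static uint32_t msi_align_first_irq(uint32_t next_free, uint32_t num)
{
    return (next_free + num - 1) & ~(num - 1);
}
```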
  20. 29 Jul 2013, 1 commit
  21. 20 Jun 2013, 2 commits
  22. 09 Apr 2013, 1 commit
    • hw: move headers to include/ · 0d09e41a
      Committed by Paolo Bonzini
      Many of these should be cleaned up with proper qdev-/QOM-ification.
      Right now there are many catch-all headers in include/hw/ARCH depending
      on cpu.h, and this makes it necessary to compile these files per-target.
      However, fixing this does not belong in these patches.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  23. 22 Mar 2013, 1 commit
    • pseries: Remove "busname" property for PCI host bridge · 89dfd6e1
      Committed by David Gibson
      Currently the "spapr-pci-host-bridge" device has a "busname" property which
      can be used to override the default assignment of qbus names for the bus
      subordinate to the PHB.  We use that for the default primary PCI bus, to
      make libvirt happy, which expects there to be a bus named simply "pci".
      The default qdev core logic would name the bus "pci.0", and the pseries
      code would otherwise name it "pci@800000020000000" which is the name it
      is given in the device tree based on its BUID.
      
      The "busname" property is rather clunky though, so this patch simplifies
      things by just using a special case hack for the default PHB, setting
      busname to "pci" when index=0.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  24. 26 Jan 2013, 1 commit
    • pseries: Improve handling of multiple PCI host bridges · caae58cb
      Committed by David Gibson
      Multiple - even many - PCI host bridges (i.e. PCI domains) are very
      common on real PAPR compliant hardware.  For reasons related to the
      PAPR specified IOMMU interfaces, PCI device assignment with VFIO will
      generally require at least two (virtual) PHBs and possibly more
      depending on which devices are assigned.
      
      At the moment the qemu PAPR PCI code will not deal with this well,
      leaving several crucial parameters of PHBs other than the default one
      uninitialized.  This patch reworks the code to allow this.
      
      Every PHB needs a unique BUID (Bus Unit Identifier, the id used for
      the PAPR PCI related interfaces) and a unique LIOBN (Logical IO Bus
      Number, the id used for the PAPR IOMMU related interfaces).  In
      addition they need windows in CPU real address space to access PCI
      memory space, PCI IO space and MSIs.  Properties are added to the PCI
      host bridge qdevice to allow configuration of all these.
      
      To simplify configuration of multiple PHBs for common cases, a
      convenience "index" property is also added.  This can be set instead
      of the low-level properties, and will generate suitable values for the
      other parameters, different for each index value.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
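The "index" convenience property described above fans one small integer out into all the per-PHB identifiers. A hypothetical sketch; every constant here is illustrative, not one of the actual QEMU defaults:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t buid;          /* Bus Unit Identifier (PAPR PCI interfaces) */
    uint32_t liobn;         /* Logical IO Bus Number (PAPR IOMMU interfaces) */
    uint64_t mem_win_addr;  /* PCI memory window in CPU real address space */
    uint64_t io_win_addr;   /* PCI IO window */
} PhbConfig;

/* one index -> a full, non-overlapping set of per-PHB parameters */
static PhbConfig phb_from_index(uint32_t index)
{
    return (PhbConfig){
        .buid         = 0x800000020000000ULL + index,
        .liobn        = 0x80000000u + index,
        .mem_win_addr = 0x10000000000ULL + (uint64_t)index * 0x1000000000ULL,
        .io_win_addr  = 0x10008000000ULL + (uint64_t)index * 0x1000000000ULL,
    };
}
```

Distinct indices yield distinct BUIDs, LIOBNs and windows, which is exactly the property the low-level per-PHB options otherwise had to guarantee by hand.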
  25. 17 Dec 2012, 1 commit