1. 23 Sep, 2015 5 commits
  2. 07 Jul, 2015 5 commits
  3. 04 Jun, 2015 10 commits
  4. 09 Mar, 2015 7 commits
    • sPAPR: Implement EEH RTAS calls · ee954280
      Gavin Shan authored
      QEMU does not yet emulate EEH RTAS requests from the guest; this
      patch implements them.
      
      The patch defines constants used by EEH RTAS calls and adds
      callbacks sPAPRPHBClass::{eeh_set_option, eeh_get_state, eeh_reset,
      eeh_configure}, which are going to be used as follows:
      
        * RTAS calls are received in spapr_pci.c, sanity check is done
          there.
        * Each RTAS handler handles what it can. If there is something it
          cannot handle and the corresponding sPAPRPHBClass callback is
          defined, that callback is called.
        * Those callbacks are only implemented for VFIO now. They do ioctl()
          to the IOMMU container fd to complete the calls. Error codes from
          that ioctl() are transferred back to the guest.
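
The flow in the list above can be sketched as an optional-callback dispatch. This is a minimal illustration and not the code from the patch: every `Demo`/`demo_` name, the status values, the option range check, and the choice to report a parameter error when no callback is defined are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative status codes; not copied from the QEMU headers. */
enum {
    DEMO_RTAS_OUT_SUCCESS = 0,
    DEMO_RTAS_OUT_PARAM_ERROR = -3
};

typedef struct DemoPHB DemoPHB;

/* Stand-in for sPAPRPHBClass: a backend sets only the hooks it supports. */
typedef struct DemoPHBClass {
    int (*eeh_set_option)(DemoPHB *phb, uint32_t addr, int option);
} DemoPHBClass;

struct DemoPHB {
    const DemoPHBClass *klass;
    int last_option;    /* records what the backend was asked to do */
};

/* Hypothetical VFIO-like backend; the real one would issue an ioctl(). */
static int demo_vfio_eeh_set_option(DemoPHB *phb, uint32_t addr, int option)
{
    (void)addr;
    phb->last_option = option;
    return DEMO_RTAS_OUT_SUCCESS;
}

const DemoPHBClass demo_vfio_class = { demo_vfio_eeh_set_option };
const DemoPHBClass demo_emulated_class = { NULL };

/* RTAS-side handler: sanity check first, then delegate if a hook exists. */
int demo_rtas_eeh_set_option(DemoPHB *phb, uint32_t addr, int option)
{
    if (phb == NULL || option < 0 || option > 3) {
        return DEMO_RTAS_OUT_PARAM_ERROR;
    }
    if (phb->klass == NULL || phb->klass->eeh_set_option == NULL) {
        /* Assumed policy: a PHB without the hook cannot serve the call. */
        return DEMO_RTAS_OUT_PARAM_ERROR;
    }
    /* The backend's result is passed straight back to the caller. */
    return phb->klass->eeh_set_option(phb, addr, option);
}
```

Keeping the hook optional lets the emulated PHB share the same handler as the VFIO one without providing stub callbacks.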
      
      [aik: defined RTAS tokens for EEH RTAS calls]
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • pseries: Switch VGA endian on H_SET_MODE · eefaccc0
      David Gibson authored
      When the guest switches the interrupt endian mode, which essentially
      means a global machine endian switch, we want to change the VGA
      framebuffer endian mode as well, in order to stay backward compatible
      with existing guests that don't know about the new endian control
      register.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • pseries: Move rtc_offset into RTC device's state structure · 880ae7de
      David Gibson authored
      The initial creation of the PAPR RTC qdev class left a wart - the rtc's
      offset was left in the sPAPREnvironment structure, accessed via a global.
      
      This patch moves it into the RTC device's own state structure, where it
      belongs.  This requires a small change to the migration stream format.  In
      order to handle incoming streams from older versions, we also need to
      retain the rtc_offset field in the sPAPREnvironment structure, so that the
      old value can be loaded into it via the vmsd, then pushed into the RTC device.
      
      Since we're changing the migration format, this also takes the opportunity
      to:
      
        * Change the rtc offset from a value in seconds to a value in
          nanoseconds, allowing nanosecond offsets between host and guest
          rtc time, if desired.
      
        * Remove both the already unused "next_irq" field and the now unused
          "rtc_offset" field from the new version of the spapr migration
          stream.
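
The unit change described above amounts to scaling an old seconds value when it arrives from an older stream. A minimal sketch follows, assuming a hypothetical `DemoRTCState` with a single nanosecond field; the names are illustrative and are not the identifiers used in the patch.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Hypothetical device state: guest RTC minus host clock, in nanoseconds. */
typedef struct DemoRTCState {
    int64_t ns_offset;
} DemoRTCState;

/* Import an offset that an older stream version stored in whole seconds. */
void demo_rtc_import_offset(DemoRTCState *rtc, int64_t legacy_offset_sec)
{
    rtc->ns_offset = legacy_offset_sec * DEMO_NSEC_PER_SEC;
}

/* Guest RTC reading in nanoseconds for a given host clock reading. */
int64_t demo_rtc_read_ns(const DemoRTCState *rtc, int64_t host_ns)
{
    return host_ns + rtc->ns_offset;
}
```

Storing nanoseconds keeps the legacy case exact (a multiple of one second) while allowing finer offsets for new streams.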
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • pseries: Make the PAPR RTC a qdev device · 28df36a1
      David Gibson authored
      At present the PAPR RTC isn't a "device" as such - it's accessed only via
      firmware/hypervisor calls, and is handled in the sPAPR core code.  This
      becomes inconvenient as we extend it in various ways.
      
      This patch makes the PAPR RTC a separate device in the qemu device model.
      
      For now, the only piece of device state - the rtc_offset - is still kept in
      the global sPAPREnvironment structure.  That's clearly wrong, but leaving
      it to be fixed in a following patch makes for a clearer separation between
      the internal re-organization of the device, and the behavioural changes
      (because the migration stream format needs to change slightly when the
      offset is moved into the device's own state).
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • pseries: Add spapr_rtc_read() helper function · e5dad1d7
      David Gibson authored
      The virtual RTC time is used in two places in the pseries machine.  The
      first is the RTAS get-time-of-day function, which returns the RTC time to
      the guest.  The second is the spapr events code, which uses it to timestamp
      event messages from the hypervisor to the guest.
      
      Currently both call qemu_get_timedate() directly, but we want to change
      that so we can properly handle the various -rtc options.  In preparation,
      create a helper function to return the virtual RTC time.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • pseries: Move sPAPR RTC code into its own file · 12f42174
      David Gibson authored
      At the moment the RTAS (firmware/hypervisor) time of day functions are
      implemented in spapr_rtas.c along with a bunch of other things.  Since
      we're going to be expanding these a bit, move the RTAS RTC related code
      out into a new file, spapr_rtc.c.  Also add its own initialization function,
      spapr_rtc_init(), called from the main machine init routine.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr_vio/spapr_iommu: Move VIO bypass where it belongs · ee9a569a
      Alexey Kardashevskiy authored
      Instead of tweaking a TCE table device by adding a bypass flag to it,
      let's add aliases to the RAM and IOMMU memory regions, and enable/disable
      them according to the selected bypass mode.
      This way the IOMMU memory region can have the size of the actual window
      rather than ram_size, which is essential for upcoming DDW support.
      
      This moves the bypass logic to the VIO layer and keeps the @bypass flag in
      the TCE table for migration compatibility only. This replaces
      spapr_tce_set_bypass() calls with explicit assignment to avoid confusion,
      as the function could do something more than just syncing the @bypass flag.
      
      This adds a pointer to VIO device into the sPAPRTCETable struct to provide
      the sPAPRTCETable device a way to update bypass mode for the VIO device.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  5. 07 Jan, 2015 1 commit
  6. 08 Sep, 2014 3 commits
    • spapr_pci: map the MSI window in each PHB · 8c46f7ec
      Greg Kurz authored
      On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
      Commit cc943c36 has modified MSI-X
      so that writes are made using the bus master address space and follow
      the IOMMU path.
      
      Unfortunately, the IOMMU address space does not have an
      MSI window: the notification is silently dropped in unassigned_mem_write
      instead of reaching the guest... The most visible effect is that all
      virtio devices have been non-functional on sPAPR since then. :(
      
      This patch does the following:
      1) map the MSI window into the IOMMU address space for each PHB
         - since each PHB instantiates its own IOMMU address space, we
           can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
         - no real need to keep the MSI window setup in a separate function,
           the spapr_pci_msi_init() code moves to spapr_phb_realize().
      
      2) kill the global MSI window as it is not needed in the end
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Locate RTAS and device-tree based on real RMA · b7d1f77a
      Benjamin Herrenschmidt authored
      We currently calculate the final RTAS and FDT location based on
      the early estimate of the RMA size, cropped to 256M on KVM since
      we only know the real RMA size at reset time which happens much
      later in the boot process.
      
      This means the FDT and RTAS end up right below 256M while they
      could be much higher, using precious RMA space and limiting
      what the OS bootloader can put there, which has proved to be
      a problem with some OSes (such as when using very large initrds).
      
      Fortunately, we do the actual copy of the device-tree into guest
      memory much later, during reset, late enough to be able to do it
      using the final RMA value. We just need to move the calculation
      to the right place.
      
      However, RTAS is still loaded too early, so we change the code to
      load the tiny blob into qemu memory early on, and then copy it into
      guest memory at reset time. It's small enough that the memory usage
      doesn't matter.
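
As a rough sketch of the calculation being moved, the two blobs can be placed just below the top of the final RMA once its size is known. The commit message does not give the layout or the limits, so the cap, both size constants, and the ordering (RTAS above the FDT) below are assumptions chosen for illustration.

```c
#include <stdint.h>

/* All three values are hypothetical and chosen only for the example. */
#define DEMO_RTAS_MAX_ADDR 0x80000000ULL  /* assumed cap on the RTAS address */
#define DEMO_RTAS_MAX_SIZE 0x10000ULL     /* assumed room for the RTAS blob */
#define DEMO_FDT_MAX_SIZE  0x100000ULL    /* assumed room for the device tree */

typedef struct DemoLayout {
    uint64_t rtas_addr;
    uint64_t fdt_addr;
} DemoLayout;

/*
 * Place RTAS and the FDT at the top of the usable RMA.
 * Precondition: rma_size is at least the sum of both size constants.
 */
DemoLayout demo_place_blobs(uint64_t rma_size)
{
    uint64_t limit = rma_size < DEMO_RTAS_MAX_ADDR ? rma_size
                                                   : DEMO_RTAS_MAX_ADDR;
    DemoLayout layout;

    layout.rtas_addr = limit - DEMO_RTAS_MAX_SIZE;
    layout.fdt_addr = layout.rtas_addr - DEMO_FDT_MAX_SIZE;
    return layout;
}
```

Calling this at reset time with the real RMA size, rather than at init time with the 256M estimate, is what frees the low memory for the bootloader.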
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [agraf: fix compilation on 32bit hosts]
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • ppc: spapr-rtas - implement os-term rtas call · 2e14072f
      Nikunj A Dadhania authored
      A PAPR-compliant guest calls this in the absence of kdump. This finally
      reaches the guest and can be handled according to the policies set by
      higher level tools (like taking a dump) for further analysis by tools like
      crash.
      
      The Linux kernel calls ibm,os-term when the extended property of os-term
      is set. This makes sure that a return to the Linux kernel is guaranteed.
      Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
      [agraf: reduce RTAS_TOKEN_MAX]
      Signed-off-by: Alexander Graf <agraf@suse.de>
  7. 27 Jun, 2014 8 commits
    • spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB · 9a321e92
      Alexey Kardashevskiy authored
      Currently the SPAPR PHB keeps track of all allocated MSI interrupts (here
      and below, MSI stands for both MSI and MSI-X) because
      XICS used to be unable to reuse interrupts. This is a problem for
      dynamic MSI reconfiguration, which happens when the guest reloads a driver
      or performs PCI hotplug. Another problem is that the existing
      implementation can enable MSI on at most 32 devices
      (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
      
      This makes use of new XICS ability to reuse interrupts.
      
      This reorganizes MSI information storage in sPAPRPHBState. Instead of
      a static array of 32 descriptors (one per PCI function), this patch adds
      a GHashTable where @config_addr is the key and a (first_irq, num) pair is
      the value. A GHashTable can dynamically grow and shrink, so the initial
      limit of 32 devices is gone.
      
      This changes the migration stream, as @msi_table was a static array while
      the new @msi_devs is a dynamic hash table. This adds a temporary array
      which is used for migration; it is populated in the "spapr_pci"::pre_save()
      callback and expanded into the hash table in the post_load() callback. Since
      the destination side does not know the number of MSI-enabled devices
      in advance and cannot pre-allocate the temporary array to receive
      migration state, this makes use of the new VMSTATE_STRUCT_VARRAY_ALLOC macro,
      which allocates the array automatically.
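
The flatten-then-rebuild pattern described above can be sketched without GLib or the VMState machinery. In this illustration a small linked list stands in for the GHashTable, and `demo_msi_pre_save()` / `demo_msi_post_load()` are hypothetical helpers that mirror the roles of the two callbacks; none of these names come from the patch, and node cleanup is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct DemoMsi {
    uint32_t first_irq;
    uint32_t num;
} DemoMsi;

/* Dynamic per-device map keyed by config_addr (stand-in for GHashTable). */
typedef struct DemoMsiNode {
    uint32_t config_addr;
    DemoMsi msi;
    struct DemoMsiNode *next;
} DemoMsiNode;

/* Flat record that exists only while migrating. */
typedef struct DemoMsiMig {
    uint32_t config_addr;
    DemoMsi msi;
} DemoMsiMig;

/* Insert a new entry or update the existing one for config_addr. */
int demo_msi_set(DemoMsiNode **map, uint32_t config_addr, DemoMsi msi)
{
    DemoMsiNode *n;

    for (n = *map; n != NULL; n = n->next) {
        if (n->config_addr == config_addr) {
            n->msi = msi;
            return 0;
        }
    }
    n = malloc(sizeof(*n));
    if (n == NULL) {
        return -1;
    }
    n->config_addr = config_addr;
    n->msi = msi;
    n->next = *map;
    *map = n;
    return 0;
}

const DemoMsi *demo_msi_get(const DemoMsiNode *map, uint32_t config_addr)
{
    for (; map != NULL; map = map->next) {
        if (map->config_addr == config_addr) {
            return &map->msi;
        }
    }
    return NULL;
}

/* pre_save role: flatten the map into a freshly allocated array. */
DemoMsiMig *demo_msi_pre_save(const DemoMsiNode *map, size_t *count)
{
    const DemoMsiNode *n;
    DemoMsiMig *arr;
    size_t i = 0;

    *count = 0;
    for (n = map; n != NULL; n = n->next) {
        (*count)++;
    }
    if (*count == 0) {
        return NULL;
    }
    arr = malloc(*count * sizeof(*arr));
    if (arr == NULL) {
        *count = 0;
        return NULL;
    }
    for (n = map; n != NULL; n = n->next, i++) {
        arr[i].config_addr = n->config_addr;
        arr[i].msi = n->msi;
    }
    return arr;
}

/* post_load role: expand the received array back into a map. */
int demo_msi_post_load(DemoMsiNode **map, const DemoMsiMig *arr, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (demo_msi_set(map, arr[i].config_addr, arr[i].msi) != 0) {
            return -1;
        }
    }
    return 0;
}
```

The array carries its own element count, which is why the receiving side needs an allocation step before it can load the entries.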
      
      This resets the MSI configuration space when interrupts are released by
      the ibm,change-msi RTAS call.
      
      This fixes traces to be more informative.
      
      This changes the vmstate_spapr_pci_msi name from "...lsi" to "...msi"; the
      old name was incorrect by accident. As the internal representation has
      changed, this also bumps the migration version number.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [agraf: drop g_malloc_n usage]
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Move interrupt allocator to xics · bee763db
      Alexey Kardashevskiy authored
      The current allocator returns IRQ numbers from a pool and does not
      support IRQ reuse in any form, as it does not keep track of what it
      previously returned; it only keeps the last returned IRQ. Some use
      cases such as PCI hot(un)plug may require IRQ release and reallocation.
      
      This moves an allocator from SPAPR to XICS.
      
      This switches IRQ users to use new API.
      
      This uses LSI/MSI flags to know if interrupt is allocated.
      
      The interrupt release function will be posted as a separate patch.
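
To show why per-IRQ flags make reuse possible, here is a minimal sketch of a flag-based allocator. The pool size, base number, flag values, and all `demo_` names are assumptions for the example. The release helper is included only to demonstrate reuse; the commit itself defers the release function to a separate patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IRQ_BASE 0x1000    /* hypothetical first IRQ number */
#define DEMO_NR_IRQS  8         /* hypothetical pool size */

enum {
    DEMO_IRQ_LSI = 0x1,
    DEMO_IRQ_MSI = 0x2
};
#define DEMO_IRQ_MASK (DEMO_IRQ_LSI | DEMO_IRQ_MSI)

typedef struct DemoICS {
    uint8_t flags[DEMO_NR_IRQS];
} DemoICS;

/* An IRQ counts as allocated when either type flag is set. */
static bool demo_irq_is_free(const DemoICS *ics, int idx)
{
    return (ics->flags[idx] & DEMO_IRQ_MASK) == 0;
}

/* Return the lowest free IRQ number, or -1 when the pool is exhausted. */
int demo_irq_alloc(DemoICS *ics, bool lsi)
{
    int i;

    for (i = 0; i < DEMO_NR_IRQS; i++) {
        if (demo_irq_is_free(ics, i)) {
            ics->flags[i] |= lsi ? DEMO_IRQ_LSI : DEMO_IRQ_MSI;
            return DEMO_IRQ_BASE + i;
        }
    }
    return -1;
}

/* Hypothetical release step: clearing the type flags frees the slot. */
void demo_irq_free(DemoICS *ics, int irq)
{
    int idx = irq - DEMO_IRQ_BASE;

    if (idx >= 0 && idx < DEMO_NR_IRQS) {
        ics->flags[idx] &= (uint8_t)~DEMO_IRQ_MASK;
    }
}
```

An allocator that only remembers the last number it handed out cannot do the final step, which is the limitation the commit message describes.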
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Add RTAS sysparm SPLPAR Characteristics · 3b50d897
      Sam Bobroff authored
      Add support for the SPLPAR Characteristics parameter to the emulated
      RTAS call ibm,get-system-parameter.
      
      The support provides just enough information to allow "cat
      /proc/powerpc/lparcfg" to succeed without generating a kernel error
      message.
      
      Without this patch the above command will produce the following kernel
      message: arch/powerpc/platforms/pseries/lparcfg.c \
      parse_system_parameter_string Error calling get-system-parameter \
      (0xfffffffd)
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Add RTAS sysparm UUID · b907d7b0
      Sam Bobroff authored
      Add support for the UUID parameter to the emulated RTAS call
      ibm,get-system-parameter.
      
      Return the guest's UUID as the value for the RTAS UUID system
      parameter, or null (a zero length result) if it is not set.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE · 3052d951
      Sam Bobroff authored
      This allows the ibm,get-system-parameter RTAS call to succeed for the
      DIAGNOSTICS_RUN_MODE system parameter.
      
      The problem can be seen with "ppc64_cpu --run-mode" from the
      powerpc-utils package which fails before this patch with "Machine does
      not support diagnostic run mode".
      
      This is corrected by using the rtas_st_buffer() function to write to
      the buffer.
      
      The RTAS constants are also moved out into a header file, some new
      constants added and the surrounding code slightly simplified.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      [agraf: remove some commentary]
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Add rtas_st_buffer utility function · ce3fa1ec
      Sam Bobroff authored
      Add a function to write length + data into a buffer as required for the
      emulation of the RTAS ibm,get-system-parameter call.
      
      If the destination is smaller than the source, the write is truncated
      and success is returned. This matches the behaviour of pHyp.
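
A minimal sketch of the truncating write is shown below. It operates on an ordinary byte array rather than guest memory, and it assumes a 16-bit big-endian length prefix that records the full source length; the commit message states neither detail, so treat both as assumptions. `demo_st_buffer()` is a hypothetical name, not the helper added by the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_RTAS_OUT_SUCCESS = 0 };

/*
 * Write a length prefix followed by as much of src as fits in dst.
 * A destination that is too small is not an error: the data is cut short
 * and success is still returned.
 */
int demo_st_buffer(uint8_t *dst, size_t dst_len,
                   const uint8_t *src, uint16_t src_len)
{
    size_t room;
    size_t n;

    if (dst_len < 2) {
        return DEMO_RTAS_OUT_SUCCESS;   /* no room even for the prefix */
    }
    dst[0] = (uint8_t)(src_len >> 8);   /* assumed big-endian prefix */
    dst[1] = (uint8_t)(src_len & 0xff);

    room = dst_len - 2;
    n = src_len < room ? src_len : room;
    memcpy(dst + 2, src, n);
    return DEMO_RTAS_OUT_SUCCESS;
}
```

With a 6-byte destination and an 8-byte source, only the first 4 data bytes are stored after the prefix and the call still reports success.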
      
      This will be used in following patches.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr_iommu: Make in-kernel TCE table optional · 9bb62a07
      Alexey Kardashevskiy authored
      POWER KVM supports a KVM_CAP_SPAPR_TCE capability which allows allocating
      TCE tables in the host kernel memory and handling H_PUT_TCE requests
      targeted to a specific LIOBN (logical bus number) right in the host without
      switching to QEMU. At the moment this is used for emulated devices only
      and the handler only puts the TCE into the table. If the in-kernel H_PUT_TCE
      handler finds a LIOBN and the corresponding table, it will put a TCE into
      the table and complete hypercall execution. The user space will not be
      notified.
      
      Upcoming VFIO support is going to use the same sPAPRTCETable device class
      so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
      tables for VFIO are going to be allocated in the host as well.
      However, VFIO operates with real IOMMU tables, and simply copying
      a TCE to the real hardware TCE table will not work, as guest physical
      to host physical address translation is required.
      
      So until the host kernel gets VFIO support for H_PUT_TCE, we had better
      not register VFIO's TCE in the host.
      
      This adds a placeholder for the KVM_CAP_SPAPR_TCE_VFIO capability. It is
      not upstream yet and is still being discussed, so for now it is always
      false, which means that in-kernel VFIO acceleration is not supported.
      
      This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
      that sPAPRTCETable should not try allocating a TCE table in the host kernel
      for VFIO. The flag is false now as at the moment there is no VFIO.
      
      This adds a vfio_accel parameter to spapr_tce_new_table(); the semantics
      are the same. Since there is only emulated PCI and VIO now, the flag is set
      to false. Upcoming VFIO support will set it to true.
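
One possible reading of the resulting policy, written as a pure function: a table is placed in the host kernel only when the basic capability is present and, for a table that asks for VFIO acceleration, the VFIO capability is present too. This is an interpretation of the description above, not code from the patch, and `demo_use_kernel_tce()` with its parameters is hypothetical.

```c
#include <stdbool.h>

/*
 * Decide whether a TCE table should be allocated in the host kernel.
 *   kvm_cap_tce      - stands for KVM_CAP_SPAPR_TCE being available
 *   kvm_cap_tce_vfio - stands for the placeholder VFIO capability
 *   vfio_accel       - the table's flag: true for a VFIO-backed table
 */
bool demo_use_kernel_tce(bool kvm_cap_tce, bool kvm_cap_tce_vfio,
                         bool vfio_accel)
{
    if (!kvm_cap_tce) {
        return false;   /* no in-kernel tables at all */
    }
    if (vfio_accel && !kvm_cap_tce_vfio) {
        return false;   /* VFIO table, but the kernel cannot translate */
    }
    return true;
}
```

Under this reading, tables for emulated devices behave exactly as before, which matches the statement that no change in behaviour is expected.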
      
      This is a preparation patch so no change in behaviour is expected.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • spapr: Fix RTAS token numbers · 3a3b8502
      Alexey Kardashevskiy authored
      At the moment spapr_rtas_register() allocates a new token number for every
      new RTAS callback so numbers are not fixed and depend on the number of
      supported RTAS handlers and the exact order of spapr_rtas_register() calls.
      These tokens are copied into the device tree and remain the same during
      the guest lifetime.
      
      When we start another guest to receive a migration, it calls
      spapr_rtas_register() as well. If the number of RTAS handlers or their
      order is different in QEMU on source and destination sides, the "/rtas"
      node in the device tree will differ. Since migration overwrites the device
      tree (as it overwrites the entire RAM), the actual RTAS config on
      the destination side gets broken.
      
      This defines global constant values for every RTAS token which QEMU
      is using today.
      
      This changes spapr_rtas_register() to accept a token number instead of
      allocating one. This changes all users of spapr_rtas_register().
      
      This changes XICS-KVM not to cache tokens registered with KVM as they
      are constant now.
      
      This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
      a base. TOKEN_MAX is moved and renamed too and its value is changed
      to the last token + 1. Boundary checks for token values are adjusted.
      
      This reserves token numbers for "os-term" handlers and PCI hotplug
      which we are working on.
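
The scheme described above can be sketched as a fixed token enumeration plus a registration function that takes the token as an argument. The base value, the token names, the handler signature, and the error-return convention are all assumptions for the example; only the overall shape (fixed numbers, a maximum equal to the last token + 1, boundary checks) comes from the commit message.

```c
#include <stddef.h>

#define DEMO_TOKEN_BASE 0x2000   /* hypothetical base value */

/* Fixed numbers: independent of registration order or handler count. */
enum {
    DEMO_RTAS_DISPLAY_CHARACTER = DEMO_TOKEN_BASE + 0,
    DEMO_RTAS_GET_TIME_OF_DAY   = DEMO_TOKEN_BASE + 1,
    DEMO_RTAS_OS_TERM           = DEMO_TOKEN_BASE + 2,  /* reserved ahead of use */
    DEMO_TOKEN_MAX              = DEMO_TOKEN_BASE + 3   /* last token + 1 */
};

typedef int (*demo_rtas_fn)(int arg);

static struct {
    const char *name;
    demo_rtas_fn fn;
} demo_rtas_table[DEMO_TOKEN_MAX - DEMO_TOKEN_BASE];

/* Register a handler under a caller-supplied token; 0 on success. */
int demo_rtas_register(int token, const char *name, demo_rtas_fn fn)
{
    int idx = token - DEMO_TOKEN_BASE;

    if (token < DEMO_TOKEN_BASE || token >= DEMO_TOKEN_MAX) {
        return -1;                      /* boundary check */
    }
    if (name == NULL || fn == NULL || demo_rtas_table[idx].fn != NULL) {
        return -1;                      /* bad input or already taken */
    }
    demo_rtas_table[idx].name = name;
    demo_rtas_table[idx].fn = fn;
    return 0;
}

/* Dispatch a call by token; -1 if the token is unknown or unregistered. */
int demo_rtas_call(int token, int arg, int *out)
{
    int idx = token - DEMO_TOKEN_BASE;

    if (token < DEMO_TOKEN_BASE || token >= DEMO_TOKEN_MAX ||
        demo_rtas_table[idx].fn == NULL) {
        return -1;
    }
    *out = demo_rtas_table[idx].fn(arg);
    return 0;
}

/* Two toy handlers so the table can be exercised. */
int demo_handler_echo(int arg) { return arg; }
int demo_handler_next(int arg) { return arg + 1; }
```

Because the token is part of the call site rather than a running counter, source and destination builds that register handlers in a different order still agree on every number.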
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  8. 16 Jun, 2014 1 commit