1. 09 Mar, 2015 14 commits
    • pseries: Move rtc_offset into RTC device's state structure · 880ae7de
      David Gibson authored
      The initial creation of the PAPR RTC qdev class left a wart - the rtc's
      offset was left in the sPAPREnvironment structure, accessed via a global.
      
      This patch moves it into the RTC device's own state structure, where it
      belongs.  This requires a small change to the migration stream format.  In
      order to handle incoming streams from older versions, we also need to
      retain the rtc_offset field in the sPAPREnvironment structure, so that it
      can be loaded via the vmsd, then pushed into the RTC device.
      
      Since we're changing the migration format, this also takes the opportunity
      to:
      
        * Change the rtc offset from a value in seconds to a value in
          nanoseconds, allowing nanosecond offsets between host and guest
          rtc time, if desired.
      
        * Remove both the already unused "next_irq" field and now unused
          "rtc_offset" field from the new version of the spapr migration
          stream
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      880ae7de
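The seconds-to-nanoseconds change described above can be sketched as follows. This is a minimal illustration, not the QEMU code: `RtcState`, `ns_offset` and `rtc_import_legacy_offset` are hypothetical names, and the sketch assumes the legacy offset is a signed count of whole seconds.

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000LL

/* Hypothetical stand-in for the RTC device's own state structure. */
typedef struct {
    int64_t ns_offset;   /* guest RTC minus reference clock, in nanoseconds */
} RtcState;

/*
 * Push an offset that was loaded from an old-format stream (whole
 * seconds) into the device state, which stores nanoseconds.
 */
static void rtc_import_legacy_offset(RtcState *rtc, int64_t legacy_offset_s)
{
    rtc->ns_offset = legacy_offset_s * NANOSECONDS_PER_SECOND;
}
```

Keeping the legacy field only as a staging area means new code reads a single nanosecond value regardless of which stream version was loaded.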
    • pseries: Make the PAPR RTC a qdev device · 28df36a1
      David Gibson authored
      At present the PAPR RTC isn't a "device" as such - it's accessed only via
      firmware/hypervisor calls, and is handled in the sPAPR core code.  This
      becomes inconvenient as we extend it in various ways.
      
      This patch makes the PAPR RTC a separate device in the qemu device model.
      
      For now, the only piece of device state - the rtc_offset - is still kept in
      the global sPAPREnvironment structure.  That's clearly wrong, but leaving
      it to be fixed in a following patch makes for a clearer separation between
      the internal re-organization of the device, and the behavioural changes
      (because the migration stream format needs to change slightly when the
      offset is moved into the device's own state).
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      28df36a1
    • pseries: Make RTAS time of day functions respect -rtc options · f01c5d84
      David Gibson authored
      In the 'pseries' machine the real time clock is provided by a
      paravirtualized firmware interface rather than a device per se; the RTAS
      get-time-of-day and set-time-of-day calls.
      
      Our current implementations of those work directly off host time (with
      an offset), not respecting options such as clock=vm which can be
      specified in the -rtc command line option.
      
      This patch reworks the RTAS RTC code to respect those options, primarily
      by basing them on the qemu_clock_get_ns(rtc_clock) function instead of
      directly on qemu_get_timedate() (which essentially handles host time, not
      virtual rtc time).
      
      As a bonus, this means our get-time-of-day function now also returns
      nanoseconds.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      f01c5d84
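A rough sketch of deriving a time of day with nanoseconds from a nanosecond clock plus an offset is shown below. It is an illustration under stated assumptions rather than the patch itself: `rtc_read_sketch` is a hypothetical name, `clock_ns` stands in for whatever `qemu_clock_get_ns(rtc_clock)` would return, and the sketch only handles non-negative times counted from the Unix epoch.

```c
#include <stdint.h>
#include <time.h>

#define NANOSECONDS_PER_SECOND 1000000000LL

/*
 * Split "clock reading + guest offset" (both in nanoseconds) into a
 * broken-down UTC time and the leftover nanoseconds.
 * Returns 0 on success, -1 if the time cannot be represented.
 */
static int rtc_read_sketch(int64_t clock_ns, int64_t offset_ns,
                           struct tm *tm, uint32_t *ns)
{
    int64_t guest_ns = clock_ns + offset_ns;
    time_t guest_s;
    const struct tm *utc;

    if (guest_ns < 0) {
        return -1;              /* sketch assumption: no pre-epoch times */
    }
    guest_s = (time_t)(guest_ns / NANOSECONDS_PER_SECOND);
    utc = gmtime(&guest_s);
    if (utc == NULL) {
        return -1;
    }
    *tm = *utc;
    *ns = (uint32_t)(guest_ns % NANOSECONDS_PER_SECOND);
    return 0;
}
```

Because the input is an abstract clock value rather than the host's wall time, the same arithmetic works whichever clock the -rtc option selects.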
    • pseries: Add spapr_rtc_read() helper function · e5dad1d7
      David Gibson authored
      The virtual RTC time is used in two places in the pseries machine.  First
      is in the RTAS get-time-of-day function which returns the RTC time to the
      guest.  Second is in the spapr events code which is used to timestamp
      event messages from the hypervisor to the guest.
      
      Currently both call qemu_get_timedate() directly, but we want to change
      that so we can properly handle the various -rtc options.  In preparation,
      create a helper function to return the virtual RTC time.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      e5dad1d7
    • pseries: Add more parameter validation in RTAS time of day functions · bbade206
      David Gibson authored
      Currently, the RTAS time of day functions only partially validate the
      number of parameters they receive and return.  Because of how the
      parameters are used, this is unlikely to lead to a crash, but it's messy.
      
      This patch adds the missing checks.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      bbade206
    • pseries: Move sPAPR RTC code into its own file · 12f42174
      David Gibson authored
      At the moment the RTAS (firmware/hypervisor) time of day functions are
      implemented in spapr_rtas.c along with a bunch of other things.  Since
      we're going to be expanding these a bit, move the RTAS RTC related code
      out into a new file, spapr_rtc.c.  Also add its own initialization function,
      spapr_rtc_init(), called from the main machine init routine.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      12f42174
    • Add more VMSTATE_*_TEST variants for integers · 87774a4a
      David Gibson authored
      Currently, vmstate.h includes helper macro variants for 8, 16 and 32-bit
      unsigned integers which include a "test" function which can selectively
      enable or disable the field's presence in the migration stream.
      
      There aren't similar helpers for 64-bit unsigned integers, or any size of
      signed integers.  This patch remedies this.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      87774a4a
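The idea of a "test" function that decides whether a field is present in the stream can be illustrated with a toy model. Nothing below is QEMU's vmstate API: `ToyField`, `before_version_3` and `toy_stream_size` are hypothetical, and the version number 3 is chosen only for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a migration field; not QEMU's field descriptor. */
typedef struct {
    const char *name;
    size_t size;                                  /* bytes on the wire */
    bool (*test)(void *opaque, int version_id);   /* NULL: always present */
} ToyField;

/* Hypothetical test: the field exists only in streams older than version 3. */
static bool before_version_3(void *opaque, int version_id)
{
    (void)opaque;
    return version_id < 3;
}

/* Sum the sizes of the fields that are present for a given stream version. */
static size_t toy_stream_size(const ToyField *fields, size_t n,
                              void *opaque, int version_id)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (fields[i].test == NULL || fields[i].test(opaque, version_id)) {
            total += fields[i].size;
        }
    }
    return total;
}
```

A field guarded this way can still be read from older streams while being left out of newer ones, which is the pattern the test variants make available for more integer types.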
    • Generalize QOM publishing of date and time from mc146818rtc.c · 8e099d14
      David Gibson authored
      The mc146818rtc driver exposes the current RTC date and time via the "date"
      property in QOM (which is also aliased to the machine's "rtc-time"
      property).  Currently it uses a custom visitor function rtc_get_date to
      do this.
      
      This patch introduces new helpers to the QOM core to expose struct tm
      valued properties via a getter function, so that this functionality can be
      more easily duplicated in other RTC implementations.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      8e099d14
    • spapr-pci: Enable huge BARs · b194df47
      Alexey Kardashevskiy authored
      At the moment sPAPR only supports a 512MB window for MMIO BARs. However,
      modern devices might want bigger 64bit BARs.
      
      This extends the MMIO window from 512MB to 62GB (aligned to
      SPAPR_PCI_WINDOW_SPACING) and advertises it in 2 records in
      the PHB "ranges" property. 32bit gets the space from
      SPAPR_PCI_MEM_WIN_BUS_OFFSET till the end of 4GB, 64bit gets the rest
      of the space. If no space is left, 64bit range is not advertised.
      
      The MMIO space size is set to the old value of 0x20000000 by default
      for pseries machines older than 2.3.
      
      This approach changes the device tree, which is a guest-visible change. However,
      it won't break migration because:
      1. we do not support migration to older QEMU versions
      2. migration to newer QEMU will migrate the device tree as well, and since
      the new layout only extends the old one and does not change address mappings,
      no breakage is expected here either.
      
      A SLOF change is required to utilize this extension.
      Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      b194df47
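The split of the MMIO window into a 32-bit and a 64-bit range can be sketched as below. This is a hedged illustration: `MmioSplit` and `split_mmio_window` are hypothetical names, and the value used for the bus offset is an assumption made for the example, not taken from the commit message.

```c
#include <stdint.h>

/* Assumed value, for illustration only; the real constant is defined by QEMU. */
#define MEM_WIN_BUS_OFFSET 0x80000000ULL
#define FOUR_GB            (1ULL << 32)

typedef struct {
    uint64_t size32;   /* bytes covered by the 32-bit "ranges" record */
    uint64_t size64;   /* bytes covered by the 64-bit record, 0 if not advertised */
} MmioSplit;

/*
 * The 32-bit range runs from the bus offset up to the 4GB boundary at
 * most; whatever is left of the window goes to the 64-bit range.
 */
static MmioSplit split_mmio_window(uint64_t win_size)
{
    uint64_t max32 = FOUR_GB - MEM_WIN_BUS_OFFSET;
    MmioSplit s;

    s.size32 = win_size < max32 ? win_size : max32;
    s.size64 = win_size - s.size32;
    return s;
}
```

With the old 0x20000000 window everything fits in the 32-bit range, so no 64-bit range would be advertised; a 62GB window under the assumed offset leaves the bulk of the space for 64-bit BARs.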
    • spapr: Add pseries-2.3 machine · 3dab0244
      Alexey Kardashevskiy authored
      The next patch will make MMIO space bigger and keep the old value for
      older pseries machines.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      3dab0244
    • pseries: Limit PCI host bridge "index" value · 3e4ac968
      David Gibson authored
      pseries guests can have large numbers of PCI host bridges.  To avoid the
      user having to specify a number of different configuration values for every
      one, the device supports an "index" property which is a shorthand for setting
      the various window and configuration addresses from a predefined sensible
      set.
      
      There are some problems with the details at present:
        * The "index" property is signed, but negative values will create PCI
          windows below where we expect, potentially colliding with other devices
        * No limit is imposed on the "index" property and large values can
          translate to extremely large window addresses.  With PCI passthrough in
          particular this can mean we exceed various mapping and physical address
          limits, causing the guest host bridge to fail in strange ways.
      
      This patch addresses this, by making "index" unsigned, and imposing a
      limit.  Currently the limit allows indices from 0..255 which is probably
      enough host bridges for the time being.  It's fairly easy to extend if
      we discover we need more.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      3e4ac968
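The effect of making "index" unsigned and bounded can be shown with a small check. `MAX_PHB_INDEX` and `phb_index_valid` are hypothetical names; only the 0..255 range comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PHB_INDEX 255u   /* the commit message allows indices 0..255 */

/*
 * With an unsigned index, a value that would have been negative shows
 * up as a very large number and is rejected by the same upper bound.
 */
static bool phb_index_valid(uint32_t index)
{
    return index <= MAX_PHB_INDEX;
}
```

A single comparison therefore covers both problems listed above: windows placed below the expected base and windows placed at extremely large addresses.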
    • target-ppc: Use right page size with hash table lookup · ad3e67d0
      Aneesh Kumar K.V authored
      We look at the two sizes specified in the ISA (4K, 64K). If neither matches,
      we consider it 16MB.
      
      Without this patch we would fail to look up addresses above the 16MB range.
      Below 16MB happened to work before because the kernel has a linear
      mapping and we always looked up the hash for 0xc000000000000000. The
      actual real address was computed by using the 16MB offset
      with the real address found with the above hash.
      
      Without Fix:
      (gdb) x/16x 0xc000000001000000
      0xc000000001000000 <list_entries+453208>:       Cannot access memory at address 0xc000000001000000
      (gdb)
      
      With Fix:
      (gdb)  x/16x 0xc000000001000000
      0xc000000001000000 <list_entries+453208>:       0x00000000      0x00000000      0x00000000      0x00000000
      0xc000000001000010 <list_entries+453224>:       0x00000000      0x00000000      0x00000000      0x00000000
      0xc000000001000020 <list_entries+453240>:       0x00000000      0x00000000      0x00000000      0x00000000
      0xc000000001000030 <list_entries+453256>:       0x00000000      0x00000000      0x00000000      0x00000000
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      ad3e67d0
    • spapr_vio/spapr_iommu: Move VIO bypass where it belongs · ee9a569a
      Alexey Kardashevskiy authored
      Instead of tweaking a TCE table device by adding a bypass flag to it,
      let's add aliases to RAM and to the IOMMU memory region, and enable/disable
      those according to the selected bypass mode.
      This way the IOMMU memory region can have the size of the actual window
      rather than ram_size, which is essential for upcoming DDW support.
      
      This moves the bypass logic to the VIO layer and keeps the @bypass flag in
      the TCE table for migration compatibility only. This replaces
      spapr_tce_set_bypass() calls with explicit assignment to avoid confusion, as
      the function could do something more than just syncing the @bypass flag.
      
      This adds a pointer to the VIO device into the sPAPRTCETable struct to give
      the sPAPRTCETable device a way to update the bypass mode for the VIO device.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      ee9a569a
    • Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging · 0048fa6c
      Peter Maydell authored
      pci, pc, virtio fixes and cleanups
      
      A bunch of fixes all over the place.
      All of ACPI refactoring has been merged.
      Legacy pci commands have been dropped.
      virtio header cleanup
      initial patches from virtio-1.0 branch
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      
      * remotes/mst/tags/for_upstream: (130 commits)
        acpi: drop unused code
        aml-build: comment fix
        acpi-build: fix typo in comment
        acpi: update generated files
        vhost user:support vhost user nic for non msi guests
        aml-build: fix build for glib < 2.22
        acpi: update generated files
        Makefile.target: binary depends on config-devices
        acpi-test-data: update after pci rewrite
        acpi, mem-hotplug: use PC_DIMM_SLOT_PROP in acpi_memory_plug_cb().
        pci-hotplug-old: Has been dead for five major releases, bury
        pci: Give a few helpers internal linkage
        acpi: make build_*() routines static to aml-build.c
        pc: acpi: remove not used anymore ssdt-[misc|pcihp].hex.generated blobs
        pc: acpi-build: drop template patching and create PCI bus tree dynamically
        tests: ACPI: update pc/SSDT.bridge due to new alg of PCI tree creation
        pc: acpi-build: simplify PCI bus tree generation
        tests: add ACPI blobs for qemu with bridge cases
        tests: bios-tables-test: add support for testing bridges
        tests: ACPI test blobs update due to PCI0._CRS changes
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      
      Conflicts:
      	hw/pci/pci-hotplug-old.c
      0048fa6c
  2. 08 Mar, 2015 9 commits
  3. 07 Mar, 2015 3 commits
  4. 05 Mar, 2015 5 commits
  5. 04 Mar, 2015 6 commits
  6. 03 Mar, 2015 3 commits
    • vl: take iothread lock very early · 576a94d8
      Paolo Bonzini authored
      If the iothread lock isn't taken by the main thread, the RCU callbacks
      might run concurrently with the main thread.  QEMU's not ready for that.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Tested-by: Gonglei <arei.gonglei@huawei.com>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      576a94d8
    • Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging · 3180aadb
      Peter Maydell authored
      - more config options
      - bootdevice, iscsi, virtio-scsi fixes
      - build system patches for MinGW and config-devices.mak
      - qemu_mutex_lock_iothread deadlock fixes
      - another tiny patch from the record/replay series
      
      # gpg: Signature made Mon Mar  2 09:59:14 2015 GMT using RSA key ID 78C7AE83
      # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
      # gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
      # gpg: WARNING: This key is not certified with a trusted signature!
      # gpg:          There is no indication that the signature belongs to the owner.
      # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
      #      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83
      
      * remotes/bonzini/tags/for-upstream:
        cpus: be more paranoid in avoiding deadlocks
        cpus: fix deadlock and segfault in qemu_mutex_lock_iothread
        virtio-scsi: Allocate op blocker reason before blocking
        Makefile.target: binary depends on config-devices
        Makefile: don't silence mak file test with V=1
        Makefile: fix up parallel building under MSYS+MinGW
        iscsi: Handle write protected case in reopen
        Give ivshmem its own config option
        Create specific config option for "platform-bus"
        Add specific config options for PCI-E bridges
        bootdevice: fix segment fault when booting guest with '-kernel' and '-initrd'
        timer: replace time() with QEMU_CLOCK_HOST
        virtio-scsi-dataplane: Call blk_set_aio_context within BQL
        block: Forbid bdrv_set_aio_context outside BQL
        scsi: give device a parent before setting properties
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      3180aadb
    • xhci: generate a Transfer Event for each Transfer TRB with the IOC bit set · aa685789
      Laszlo Ersek authored
      At the moment, when the XHCI driver in edk2
      (MdeModulePkg/Bus/Pci/XhciDxe/XhciDxe.inf) runs on QEMU, with the options
      
        -device nec-usb-xhci -device usb-kbd
      
      it crashes with:
      
        ASSERT MdeModulePkg/Bus/Pci/XhciDxe/XhciSched.c(1759):
        TrsRing != ((void*) 0)
      
      The crash hits in the following edk2 call sequence (all files under
      MdeModulePkg/Bus/):
      
      UsbEnumerateNewDev()                         [Usb/UsbBusDxe/UsbEnumer.c]
        UsbBuildDescTable()                        [Usb/UsbBusDxe/UsbDesc.c]
          UsbGetDevDesc()                          [Usb/UsbBusDxe/UsbDesc.c]
            UsbCtrlGetDesc(USB_REQ_GET_DESCRIPTOR) [Usb/UsbBusDxe/UsbDesc.c]
              UsbCtrlRequest()                     [Usb/UsbBusDxe/UsbDesc.c]
                UsbHcControlTransfer()             [Usb/UsbBusDxe/UsbUtility.c]
                  XhcControlTransfer()             [Pci/XhciDxe/Xhci.c]
                    XhcCreateUrb()                 [Pci/XhciDxe/XhciSched.c]
                      XhcCreateTransferTrb()       [Pci/XhciDxe/XhciSched.c]
                    XhcExecTransfer()              [Pci/XhciDxe/XhciSched.c]
                      XhcCheckUrbResult()          [Pci/XhciDxe/XhciSched.c]
                        //
                        // look for TRB_TYPE_DATA_STAGE event [1]
                        //
                    //
                    // Store a copy of the device descriptor, as the hub device
                    // needs this info to configure endpoint. [2]
                    //
        UsbSetConfig()                             [Usb/UsbBusDxe/UsbDesc.c]
          UsbCtrlRequest(USB_REQ_SET_CONFIG)       [Usb/UsbBusDxe/UsbDesc.c]
            UsbHcControlTransfer()                 [Usb/UsbBusDxe/UsbUtility.c]
              XhcControlTransfer()                 [Pci/XhciDxe/Xhci.c]
                XhcSetConfigCmd()                  [Pci/XhciDxe/XhciSched.c]
                  XhcInitializeEndpointContext()   [Pci/XhciDxe/XhciSched.c]
                    //
                    // allocate transfer ring for the endpoint [3]
                    //
      
      USBKeyboardDriverBindingStart()              [Usb/UsbKbDxe/EfiKey.c]
        UsbIoAsyncInterruptTransfer()              [Usb/UsbBusDxe/UsbBus.c]
          UsbHcAsyncInterruptTransfer()            [Usb/UsbBusDxe/UsbUtility.c]
            XhcAsyncInterruptTransfer()            [Pci/XhciDxe/Xhci.c]
              XhcCreateUrb()                       [Pci/XhciDxe/Xhci.c]
                XhcCreateTransferTrb()             [Pci/XhciDxe/XhciSched.c]
                  XhcSyncTrsRing()                 [Pci/XhciDxe/XhciSched.c]
                    ASSERT (TrsRing != NULL) [4]
      
      UsbEnumerateNewDev() in the USB bus driver issues a GET_DESCRIPTOR
      request, in order to determine the number of configurations that the
      endpoint supports. The request consists of three stages (three TRBs):
      setup, data, and status. The length of the response is determined in [1],
      namely from the transfer event that the host controller generates in
      response to the request's middle stage (ie. the data stage).
      
      If the length of the answer is correct (a full GET_DESCRIPTOR request
      takes 18 bytes), then the XHCI driver that underlies the USB bus driver
      "snoops" (caches) the descriptor data for later [2].
      
      Later, the USB bus driver sends a SET_CONFIG request. The underlying XHCI
      driver allocates a transfer ring for the endpoint, relying on the data
      snooped and cached in step [2].
      
      Finally, the USB keyboard driver submits an asynchronous interrupt
      transfer to manage the keyboard. As part of this it asserts [4] that the
      ring has been allocated in step [3].
      
      And this ASSERT() fires. The root cause can be found in the way QEMU
      handles the initial GET_DESCRIPTOR request.
      
      Again, that request consists of three stages (TRBs, Transfer Request
      Blocks), "setup", "data", and "status". The XhcCreateTransferTrb()
      function sets the IOC ("Interrupt on Completion") flag in each of these
      TRBs.
      
      According to the XHCI specification, the host controller shall generate a
      Transfer Event in response to *each* individual TRB of the request that
      had the IOC flag set. This means that QEMU should queue three events:
      setup, data, and status, for edk2's XHCI driver.
      
      However, QEMU only generates two events:
      - one for the setup (ie. 1st) stage,
      - another for the status (ie. 3rd) stage.
      
      No event is generated for the middle (ie. data) stage. The loop in QEMU's
      xhci_xfer_report() function runs three times, but due to the "reported"
      variable, only the first and the last TRBs elicit events, the middle (data
      stage) results in no event queued.
      
      As a consequence:
      - When handling the GET_DESCRIPTOR request, XhcCheckUrbResult() in [1]
        does not update the response length from zero.
      
      - XhcControlTransfer() thinks that the response is invalid (it has zero
        length payload instead of 18 bytes), hence [2] is not reached; the
        device descriptor is not stashed for later, and the number of possible
        configurations is left at zero.
      
      - When handling the SET_CONFIG request, (NumConfigurations == 0) from
        above prevents the allocation of the endpoint's transfer ring.
      
      - When the keyboard driver tries to use the endpoint, the ASSERT() blows
        up.
      
      The solution is to correct the emulation in QEMU, and to generate a
      transfer event whenever IOC is set in a TRB.
      
      The patch replaces
      
        !reported && (IOC || foo)    == !reported && IOC ||
                                        !reported && foo
      
      with
      
        IOC || (!reported && foo)    == IOC ||
                                        !reported && foo
      
      which only changes how
      
        reported && IOC
      
      is handled. (Namely, it now generates an event.)
      
      Tested with edk2 built for "qemu-system-aarch64 -M virt" (ie.
      "ArmVirtualizationQemu.dsc", aka "AAVMF"), and guest Linux.
      Signed-off-by: Laszlo Ersek <lersek@redhat.com>
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      aa685789
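The boolean rewrite in the xhci commit above can be checked exhaustively. The sketch below is not QEMU code: `event_before`, `event_after` and `count_differences` are hypothetical names, and `foo` is the commit message's own placeholder for the remaining conditions.

```c
#include <stdbool.h>

/* Condition as written before the patch. */
static bool event_before(bool reported, bool ioc, bool foo)
{
    return !reported && (ioc || foo);
}

/* Condition as written after the patch: IOC alone is sufficient. */
static bool event_after(bool reported, bool ioc, bool foo)
{
    return ioc || (!reported && foo);
}

/* Count the input combinations for which the two conditions disagree. */
static int count_differences(void)
{
    int differences = 0;

    for (int r = 0; r < 2; r++) {
        for (int i = 0; i < 2; i++) {
            for (int f = 0; f < 2; f++) {
                if (event_before(r, i, f) != event_after(r, i, f)) {
                    differences++;
                }
            }
        }
    }
    return differences;
}
```

Of the eight input combinations, the two conditions disagree only when both `reported` and `ioc` are true, which matches the claim that only the handling of `reported && IOC` changes.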