1. 20 Nov, 2017 1 commit
  2. 18 Nov, 2017 4 commits
  3. 16 Nov, 2017 2 commits
    • NUMA: Enable adding NUMA node implicitly · 7b8be49d
      Committed by Dou Liyang
      Linux and Windows need the ACPI SRAT table for memory hotplug to work properly,
      but QEMU currently doesn't create an SRAT table if no NUMA options are present
      on the CLI.

      This breaks both Linux and Windows guests under certain conditions:
       * Windows: won't enable memory hotplug at all without an SRAT table
       * Linux: if QEMU is started with all initial memory below 4G and no SRAT table
         present, the guest kernel will use nommu DMA ops, which breaks 32-bit hw drivers
         when memory is hotplugged and the guest tries to use it with those drivers.
      
      Fix the above issues by automatically creating a NUMA node when QEMU is started with
      memory hotplug enabled but without '-numa' options on the CLI.
      (PS: auto-create the NUMA node only for new machine types so as not to break migration).

      This provides an SRAT table to guests without explicit -numa options on the CLI
      and allows:
       * Windows: to enable memory hotplug
       * Linux: to switch to SWIOTLB DMA ops, bouncing DMA transfers to 32-bit allocated
         buffers that legacy drivers/hw can handle.
      
      [Rewritten by Igor]
      Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Suggested-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Marcel Apfelbaum <marcel@redhat.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Thomas Huth <thuth@redhat.com>
      Cc: Alistair Francis <alistair23@gmail.com>
      Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
      Cc: Izumi Taku <izumi.taku@jp.fujitsu.com>
      Reviewed-by: Igor Mammedov <imammedo@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
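The decision described in this commit message can be sketched in C as follows. This is a minimal illustration, not QEMU's code: `struct machine_cfg`, its fields, and `effective_numa_nodes()` are hypothetical names, and "memory hotplug enabled" is assumed here to mean that at least one hotplug RAM slot was requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant machine configuration. */
struct machine_cfg {
    int numa_nodes;             /* nodes requested with -numa on the CLI */
    int ram_slots;              /* assumption: > 0 means memory hotplug is enabled */
    bool auto_numa_with_memhp;  /* assumption: true only for new machine types */
};

/* Return the number of NUMA nodes the machine should end up with. */
static int effective_numa_nodes(const struct machine_cfg *cfg)
{
    assert(cfg->numa_nodes >= 0 && cfg->ram_slots >= 0);

    /* No explicit -numa, hotplug enabled, new machine type: add one node
     * implicitly so that an SRAT table gets generated for the guest. */
    if (cfg->numa_nodes == 0 && cfg->ram_slots > 0 && cfg->auto_numa_with_memhp) {
        return 1;
    }
    /* Otherwise keep whatever the user asked for (older machine types stay
     * unchanged so that migration is not broken). */
    return cfg->numa_nodes;
}
```

Gating the behavior on a per-machine-type flag is what keeps older machine types bit-for-bit compatible.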
    • hw/pci-host: Fix x86 Host Bridges 64bit PCI hole · 9fa99d25
      Committed by Marcel Apfelbaum
      Currently there is no MMIO range over 4G
      reserved for PCI hotplug. Since the 32bit PCI hole
      depends on the number of cold-plugged PCI devices
      and other factors, it may well be too small
      to hotplug PCI devices with large BARs.
      
      Fix it by reserving 2G for the i440FX chipset
      in order to comply with older Win32 Guest OSes
      and 32G for the Q35 chipset.
      
      Although the new pci-hole64-size defaults will also appear in
      "info qtree" for older machines, the property was not
      implemented there, so no changes will be visible to guests.

      Note this is a regression since previous QEMU versions had
      some range reserved for 64bit PCI hotplug.
      Reviewed-by: Laszlo Ersek <lersek@redhat.com>
      Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
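The per-chipset defaults named in this message (2G and 32G) can be written down as a small C sketch. The enum and function names are hypothetical and chosen only for illustration; only the two sizes come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Hypothetical identifiers for the two x86 host bridges discussed above. */
enum chipset { CHIPSET_I440FX, CHIPSET_Q35 };

/* Default size of the 64-bit PCI hole reserved for hotplug, per chipset. */
static uint64_t default_pci_hole64_size(enum chipset c)
{
    switch (c) {
    case CHIPSET_I440FX:
        return 2 * GiB;   /* kept small for older Win32 guest OSes */
    case CHIPSET_Q35:
        return 32 * GiB;
    }
    assert(0 && "unknown chipset");
    return 0;
}
```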
  4. 15 Nov, 2017 1 commit
  5. 14 Nov, 2017 2 commits
    • thread-posix: fix qemu_rec_mutex_trylock macro · 54113dd5
      Committed by Emilio G. Cota
      We never noticed because it has no users.
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1510273811-13419-1-git-send-email-cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • xics/kvm: synchonize state before 'info pic' · dcb556fc
      Committed by Greg Kurz
      When using the emulated XICS, the 'info pic' monitor command shows:
      
      CPU 0 XIRR=ff000000 ((nil)) PP=ff MFRR=ff
      ICS 1000..13ff 0x10040060340
        1000 MSI 05 00
        1001 MSI 05 00
        1002 MSI 05 00
        1003 MSI ff 00
        1004 LSI ff 00
        1005 LSI ff 00
        1006 LSI ff 00
        1007 LSI ff 00
        1008 MSI 05 00
        1009 MSI 05 00
        100a MSI 05 00
        100b MSI 05 00
        100c MSI 05 00
      
      but when using the in-kernel XICS with the very same guest, we get:
      
      CPU 0 XIRR=00000000 ((nil)) PP=ff MFRR=ff
      ICS 1000..13ff 0x10032e00340
        1000 MSI ff 00
        1001 MSI ff 00
        1002 MSI ff 00
        1003 MSI ff 00
        1004 LSI ff 00
        1005 LSI ff 00
        1006 LSI ff 00
        1007 LSI ff 00
        1008 MSI ff 00
        1009 MSI ff 00
        100a MSI ff 00
        100b MSI ff 00
        100c MSI ff 00
      
      i.e., all IRQs are masked and XIRR is zero, whereas we should get the
      same output as with the emulated XICS.
      
      If the guest is then migrated, 'info pic' shows the expected values
      on both source and destination.
      
      The problem is that QEMU doesn't synchronize with KVM before printing
      the XICS state. Migration happens to fix the output because it enforces
      synchronization with KVM.
      
      To fix the invalid output of 'info pic', this patch introduces a new
      synchronize_state operation for both ICPStateClass and ICSStateClass.
      The ICP operation relies on run_on_cpu() in order to kick the vCPU
      and avoid sleeping on KVM_GET_ONE_REG.
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
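The "synchronize before printing" idea from this message can be illustrated with an optional hook that is called before cached state is read. This is a simplified sketch under stated assumptions: `ICSSketch`, `kernel_sync()`, and `priority_for_print()` are hypothetical names, and a plain field stands in for state that would really be fetched from KVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ICSSketch ICSSketch;

struct ICSSketch {
    uint8_t priority;         /* copy cached in userspace, possibly stale */
    uint8_t kernel_priority;  /* stand-in for state owned by the in-kernel device */
    /* Optional hook: NULL for a fully emulated device, set for an in-kernel one. */
    void (*synchronize_state)(ICSSketch *ics);
};

/* Hypothetical hook for the in-kernel case: refresh the cached copy. */
static void kernel_sync(ICSSketch *ics)
{
    ics->priority = ics->kernel_priority;
}

/* What a monitor command would call before printing the priority. */
static uint8_t priority_for_print(ICSSketch *ics)
{
    assert(ics != NULL);
    if (ics->synchronize_state) {
        ics->synchronize_state(ics);
    }
    return ics->priority;
}
```

With the hook in place, the emulated and in-kernel variants report the same value for the same guest state, which is the behavior the commit message asks for.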
  6. 13 Nov, 2017 4 commits
  7. 10 Nov, 2017 1 commit
  8. 09 Nov, 2017 1 commit
  9. 05 Nov, 2017 1 commit
  10. 01 Nov, 2017 11 commits
  11. 31 Oct, 2017 5 commits
  12. 30 Oct, 2017 1 commit
    • s390x/kvm: use cpu model for gscb on compat machines · 0280b3eb
      Committed by Christian Borntraeger
      Starting a guest with
        <os>
          <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
        </os>
        <cpu mode='host-model'/>
      
      on an IBM z14 results in
      
      "qemu-system-s390x: Some features requested in the CPU model are not
      available in the configuration: gs"
      
      This is because guarded storage is fenced for compat machines that did
      not have guarded storage support. While this prevents future migration
      abort (by not starting the guest at all), not being able to start a
      "host-model" guest is very much unexpected.  As it turns out, even if we
      would modify libvirt to not expand the cpu model to contain "gs" for
      compat machines, it cannot guarantee that a migration will succeed. For
      example if the kernel changes its features (or the user has nested=1 on
      one host but not on the other) the migration will fail nevertheless.  So
      instead of fencing "gs" for machines <= 2.9, let's allow it for all
      machine types that support the CPU model. This will make "host-model"
      runnable all the time, while relying on the CPU model to reject invalid
      migration attempts. We also need to change the migration for guarded
      storage.
      Additional discussions about host-model are still pending but are out
      of scope of this patch.
      Suggested-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Cornelia Huck <cohuck@redhat.com>
      Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
  13. 27 Oct, 2017 3 commits
  14. 26 Oct, 2017 3 commits
    • block: Align block status requests · efa6e2ed
      Committed by Eric Blake
      Any device that has request_alignment greater than 512 should be
      unable to report status at a finer granularity; it may also be
      simpler for such devices to be guaranteed that the block layer
      has rounded things out to the granularity boundary (the way the
      block layer already rounds all other I/O out).  Besides, getting
      the code correct for super-sector alignment also benefits us
      for the fact that our public interface now has byte granularity,
      even though none of our drivers have byte-level callbacks.
      
      Add an assertion in blkdebug that proves that the block layer
      never requests status of unaligned sections, similar to what it
      does on other requests (while still keeping the generic helper
      in place for when future patches add a throttle driver).  Note
      that iotest 177 already covers this (it would fail if you use
      just the blkdebug.c hunk without the io.c changes).  Meanwhile,
      we can drop assertions in callers that no longer have to pass
      in sector-aligned addresses.
      
      There is a mid-function scope added for 'count' and 'longret',
      for a couple of reasons: first, an upcoming patch will add an
      'if' statement that checks whether a driver has an old- or
      new-style callback, and can conveniently use the same scope for
      less indentation churn at that time.  Second, since we are
      trying to get rid of sector-based computations, wrapping things
      in a scope makes it easier to group and see what will be
      deleted in a final cleanup patch once all drivers have been
      converted to the new-style callback.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
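Rounding a status request out to the device's alignment, as this message describes, can be sketched in C like this. The function name and signature are hypothetical; the sketch assumes the alignment is a power of two and shows only the widening step, not any clamping of the result back to the caller's original range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen [offset, offset + bytes) so that both ends fall on a multiple of
 * 'align'. Hypothetical helper, not a QEMU function.
 */
static void align_status_request(int64_t offset, int64_t bytes, int64_t align,
                                 int64_t *aligned_offset, int64_t *aligned_bytes)
{
    assert(align > 0 && (align & (align - 1)) == 0);  /* power of two */
    assert(offset >= 0 && bytes >= 0);

    int64_t start = offset & ~(align - 1);                       /* round down */
    int64_t end = (offset + bytes + align - 1) & ~(align - 1);   /* round up */

    *aligned_offset = start;
    *aligned_bytes = end - start;
}
```

A driver that receives only such widened requests can safely assert that the offset and length it sees are multiples of its alignment.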
    • block: Convert bdrv_get_block_status_above() to bytes · 31826642
      Committed by Eric Blake
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the name of the function from bdrv_get_block_status_above()
      to bdrv_block_status_above() ensures that the compiler enforces that
      all callers are updated.  Likewise, since a byte interface allows
      an offset mapping that might not be sector aligned, split the mapping
      out of the return value and into a pass-by-reference parameter.  For
      now, the io.c layer still assert()s that all uses are sector-aligned,
      but that can be relaxed when a later patch implements byte-based
      block status in the drivers.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_block_status(), plus
      updates for the new split return interface.  But some code,
      particularly bdrv_block_status(), gets a lot simpler because it no
      longer has to mess with sectors.  Likewise, mirror code no longer
      computes s->granularity >> BDRV_SECTOR_BITS, and can therefore drop
      an assertion about alignment because the loop no longer depends on
      alignment (never mind that we don't really have a driver that
      reports sub-sector alignments, so it's not really possible to test
      the effect of sub-sector mirroring).  Fix a neighboring assertion to
      use is_power_of_2 while there.
      
      For ease of review, bdrv_get_block_status() was tackled separately.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
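The "scaling at the callers" and the split return interface can be sketched in C as follows. Everything here is illustrative: `status_bytes()` and `status_sectors()` are hypothetical names, the flag values and the fixed host base offset are assumptions, and the byte-based function simply reports the whole range as allocated.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9
#define SECTOR_SIZE (INT64_C(1) << SECTOR_BITS)

/* Assumed flag values for this sketch only. */
#define ST_DATA         0x01
#define ST_OFFSET_VALID 0x04

/* New style: byte-based query; the mapping is an out-parameter, not part of
 * the return value. */
static int status_bytes(int64_t offset, int64_t bytes,
                        int64_t *pnum, int64_t *map)
{
    const int64_t host_base = 65536;  /* hypothetical location in the host file */

    *pnum = bytes;                    /* whole range has the same status */
    *map = host_base + offset;
    return ST_DATA | ST_OFFSET_VALID;
}

/* A caller that still thinks in sectors scales at the boundary. */
static int status_sectors(int64_t sector_num, int nb_sectors,
                          int *pnum_sectors, int64_t *map)
{
    int64_t pnum_bytes;
    int ret = status_bytes(sector_num << SECTOR_BITS,
                           (int64_t)nb_sectors << SECTOR_BITS,
                           &pnum_bytes, map);

    /* For now the result is still expected to be sector-aligned. */
    assert((pnum_bytes & (SECTOR_SIZE - 1)) == 0);
    *pnum_sectors = (int)(pnum_bytes >> SECTOR_BITS);
    return ret;
}
```

Once every caller works in bytes, the sector-based wrapper and its alignment assertion can be dropped.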
    • block: Convert bdrv_get_block_status() to bytes · 237d78f8
      Committed by Eric Blake
      We are gradually moving away from sector-based interfaces, towards
      byte-based.  In the common case, allocation is unlikely to ever use
      values that are not naturally sector-aligned, but it is possible
      that byte-based values will let us be more precise about allocation
      at the end of an unaligned file that can do byte-based access.
      
      Changing the name of the function from bdrv_get_block_status() to
      bdrv_block_status() ensures that the compiler enforces that all
      callers are updated.  For now, the io.c layer still assert()s that
      all callers are sector-aligned, but that can be relaxed when a later
      patch implements byte-based block status in the drivers.
      
      There was an inherent limitation in returning the offset via the
      return value: we only have room for BDRV_BLOCK_OFFSET_MASK bits, which
      means an offset can only be mapped for sector-aligned queries (or,
      if we declare that non-aligned input is at the same relative position
      modulo 512 of the answer), so the new interface also changes things to
      return the offset via output through a parameter by reference rather
      than mashed into the return value.  We'll have some glue code that
      munges between the two styles until we finish converting all uses.
      
      For the most part this patch is just the addition of scaling at the
      callers followed by inverse scaling at bdrv_block_status(), coupled
      with the tweak in calling convention.  But some code, particularly
      bdrv_is_allocated(), gets a lot simpler because it no longer has to
      mess with sectors.
      
      For ease of review, bdrv_get_block_status_above() will be tackled
      separately.
      Signed-off-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
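The limitation of returning the offset inside the return value can be demonstrated with a small C sketch. It assumes, for illustration, that the low 9 bits of the return value carry status flags and the remaining bits carry the offset; `pack_status()` and `unpack_offset()` are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE INT64_C(512)
#define FLAG_MASK   (SECTOR_SIZE - 1)  /* low 9 bits: status flags */
#define OFFSET_MASK (~FLAG_MASK)       /* remaining bits: host offset */

/* Old style: offset and flags share one integer, so the offset's low bits
 * are silently dropped. */
static int64_t pack_status(int64_t host_offset, int flags)
{
    assert((flags & ~FLAG_MASK) == 0);
    return (host_offset & OFFSET_MASK) | flags;
}

static int64_t unpack_offset(int64_t packed)
{
    return packed & OFFSET_MASK;
}
```

A sector-aligned offset such as 4096 survives the round trip, while 4196 comes back as 4096; passing the offset through a separate by-reference parameter removes that loss.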