提交 · 758ead31c7e17bf17a9ef2e0ca1c3e86ab296b43 · openeuler / qemu

16 11月, 2017 2 次提交

NUMA: Enable adding NUMA node implicitly · 7b8be49d

由 Dou Liyang 提交于 11月 14, 2017

Linux and Windows need ACPI SRAT table to make memory hotplug work properly,
however currently QEMU doesn't create SRAT table if numa options aren't present
on CLI.

Which breaks both linux and windows guests in certain conditions:
 * Windows: won't enable memory hotplug without SRAT table at all
 * Linux: if QEMU is started with initial memory all below 4Gb and no SRAT table
   present, guest kernel will use nommu DMA ops, which breaks 32bit hw drivers
   when memory is hotplugged and guest tries to use it with that drivers.

Fix above issues by automatically creating a numa node when QEMU is started with
memory hotplug enabled but without '-numa' options on CLI.
(PS: auto-create numa node only for new machine types so not to break migration).

Which would provide SRAT table to guests without explicit -numa options on CLI
and would allow:
 * Windows: to enable memory hotplug
 * Linux: switch to SWIOTLB DMA ops, to bounce DMA transfers to 32bit allocated
   buffers that legacy drivers/hw can handle.

[Rewritten by Igor]
Reported-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com>
Suggested-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Alistair Francis <alistair23@gmail.com>
Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
Cc: Izumi Taku <izumi.taku@jp.fujitsu.com>
Reviewed-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

7b8be49d

hw/pci-host: Fix x86 Host Bridges 64bit PCI hole · 9fa99d25

由 Marcel Apfelbaum 提交于 11月 11, 2017

Currently there is no MMIO range over 4G
reserved for PCI hotplug. Since the 32bit PCI hole
depends on the number of cold-plugged PCI devices
and other factors, it is very possible is too small
to hotplug PCI devices with large BARs.

Fix it by reserving 2G for I4400FX chipset
in order to comply with older Win32 Guest OSes
and 32G for Q35 chipset.

Even if the new defaults of pci-hole64-size will appear in
"info qtree" also for older machines, the property was
not implemented so no changes will be visible to guests.

Note this is a regression since prev QEMU versions had
some range reserved for 64bit PCI hotplug.
Reviewed-by: NLaszlo Ersek <lersek@redhat.com>
Reviewed-by: NGerd Hoffmann <kraxel@redhat.com>
Signed-off-by: NMarcel Apfelbaum <marcel@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

9fa99d25

05 11月, 2017 1 次提交

pci-assign: Remove · ab37bfc7

由 Paolo Bonzini 提交于 10月 20, 2017

Legacy PCI device assignment has been removed from Linux in 4.12,
and had been deprecated 2 years ago there.  We can remove it from
QEMU as well.

The ROM loading code was shared with Xen PCI passthrough, so move
it to hw/xen.
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ab37bfc7

27 10月, 2017 2 次提交

x86: Skip check apic_id_limit for Xen · 1a26f466

由 Lan Tianyu 提交于 8月 15, 2017

Xen vIOMMU device model will be in Xen hypervisor. Skip vIOMMU
check for Xen here when vcpu number is more than 255.
Signed-off-by: NLan Tianyu <tianyu.lan@intel.com>
Message-Id: <1502842933-8323-1-git-send-email-tianyu.lan@intel.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

1a26f466

xen: Log errno rather than return value · 7cdcca72

由 Ross Lagerwall 提交于 10月 11, 2017

xen_modified_memory() sets errno to communicate what went wrong so log
this rather than the return value which is not interesting.
Signed-off-by: NRoss Lagerwall <ross.lagerwall@citrix.com>
Acked-by: NAnthony PERARD <anthony.perard@citrix.com>
Signed-off-by: NStefano Stabellini <sstabellini@kernel.org>

7cdcca72

15 10月, 2017 4 次提交

pc: remove useless hot_add_cpu initialisation · 46202d85

由 Laurent Vivier 提交于 10月 10, 2017

Since 4458fb3a (pc: Eliminate pc_default_machine_options()),
hot_add_cpu is set in pc_machine_class_init(), so we don't
need to set it in pc_q35_machine_options(), pc_i440fx_machine_options()
and xenfv_machine_options(), except to clear it in
pc_i440fx_1_4_machine_opt().
Signed-off-by: NLaurent Vivier <lvivier@redhat.com>
Reviewed-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Acked-by: NAnthony PERARD <anthony.perard@citrix.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

46202d85

isapc: Remove unnecessary migration compatibility code · b5dac424

由 Eduardo Habkost 提交于 10月 06, 2017

We don't touch isapc when we change guest ABI and add new entries
to PC_COMPAT_* or new PCMachineClass compat flags.  This means
isapc never guaranteed guest ABI and cross-QEMU-version live
migration compatibility.  There's no point in keeping code for
kvm-pv-eoi and APIC ID compatibility in pc_init_isa().
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

b5dac424

pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices · fd3b02c8

由 Eduardo Habkost 提交于 9月 27, 2017

Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:

1) The ones that already have INTERFACE_PCIE_DEVICE set:

* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3

2) base-pci-bridge

Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:

* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge

The direct subtypes of base-pci-bridge not touched by this patch
are:

* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
  marked as PCIe-only devices.

3) megasas-base

Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.

"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".
Acked-by: NAlberto Garcia <berto@igalia.com>
Acked-by: NJohn Snow <jsnow@redhat.com>
Acked-by: NAnthony PERARD <anthony.perard@citrix.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NMarcel Apfelbaum <marcel@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

fd3b02c8

fw_cfg: add write callback · 5f9252f7

由 Marc-André Lureau 提交于 9月 11, 2017

Reintroduce the write callback that was removed when write support was
removed in commit 023e3148.

Contrary to the previous callback implementation, the write_cb
callback is called whenever a write happened, so handlers must be
ready to handle partial write as necessary.
Signed-off-by: NMarc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

5f9252f7

12 10月, 2017 1 次提交

pc: make sure that plugged CPUs are of the same type · 6970c5ff

由 Igor Mammedov 提交于 10月 10, 2017

heterogeneous cpus are not supported and hotplugging different
cpu model crashes QEMU:

  qemu-system-x86_64 -cpu qemu64 -smp 1,maxcpus=2
  (qemu) device_add host-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,id=foo
  (qemu) info cpus
  error: failed to get MSR 0x38d
  qemu-system-x86_64: target/i386/kvm.c:2121: kvm_get_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
  Aborted (core dumped)

Gracefully fail hotplug process in case of user mistake.
Reported-by: NGreg Kurz <groug@kaod.org>
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Message-Id: <1507638879-200718-1-git-send-email-imammedo@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6970c5ff

02 10月, 2017 1 次提交

kvmclock: use the updated system_timer_msr · 346b1215

由 Jim Somerville 提交于 9月 29, 2017

Fixes e2b6c171 (kvmclock: update system_time_msr address forcibly)
which makes a call to get the latest value of the address
stored in system_timer_msr, but then uses the old address anyway.
Signed-off-by: NJim Somerville <Jim.Somerville@windriver.com>
Message-Id: <59b67db0bd15a46ab47c3aa657c81a4c11f168ea.1506702472.git.Jim.Somerville@windriver.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

346b1215

27 9月, 2017 1 次提交

migration: pre_save return int · 44b1ff31

由 Dr. David Alan Gilbert 提交于 9月 25, 2017

Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.

Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.

Note: If you add an error exit in your pre_save you must emit
an error_report to say why.
Signed-off-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: NPeter Xu <peterx@redhat.com>
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NJuan Quintela <quintela@redhat.com>
Signed-off-by: NDr. David Alan Gilbert <dgilbert@redhat.com>

44b1ff31

20 9月, 2017 2 次提交

hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM · 4926403c

由 Eduardo Habkost 提交于 9月 01, 2017

Currently, Using the fisrt node without memory on the machine makes
QEMU unhappy. With this example command line:
  ... \
  -m 1024M,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,mem=1024M,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
Guest reports "No NUMA configuration found" and the NUMA topology is
wrong.

This is because when QEMU builds ACPI SRAT, it regards node 0 as the
default node to deal with the memory hole(640K-1M). this means the
node0 must have some memory(>1M), but, actually it can have no
memory.

Fix this problem by cut out the 640K hole in the same way the PCI
4G hole does.
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com>
Message-Id: <1504231805-30957-2-git-send-email-douly.fnst@cn.fujitsu.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

4926403c

numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed · 79e07936

由 Igor Mammedov 提交于 6月 01, 2017

Calculating default node-ids for CPUs in possible_cpu_arch_ids()
is rather fragile since defaults calculation uses nb_numa_nodes but
callback might be potentially called early before all -numa CLI
options are parsed, which would lead to cpus assigned only upto
nb_numa_nodes at the time possible_cpu_arch_ids() is called.

Issue was introduced by
(7c88e65d numa: mirror cpu to node mapping in MachineState::possible_cpus)
and for example CLI:
  -smp 4 -numa node,cpus=0 -numa node
would set props.node-id in possible_cpus array for every non
explicitly mapped CPU to the first node.

Issue is not visible to guest nor to mgmt interface due to
  1) implictly mapped cpus are forced to the first node in
     case of partial mapping
  2) in case of default mapping possible_cpu_arch_ids() is
     called after all -numa options are parsed (resulting
     in correct mapping).

However it's fragile to rely on late execution of
possible_cpu_arch_ids(), therefore add machine specific
callback that returns node-id for CPU and use it to calculate/
set defaults at machine_numa_finish_init() time when all -numa
options are parsed.
Reported-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Message-Id: <1496314408-163972-1-git-send-email-imammedo@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

79e07936

19 9月, 2017 6 次提交

General warn report fixups · b62e39b4

由 Alistair Francis 提交于 9月 11, 2017

Tidy up some of the warn_report() messages after having converted them
to use warn_report().
Signed-off-by: NAlistair Francis <alistair.francis@xilinx.com>
Reviewed-by: NMarkus Armbruster <armbru@redhat.com>
Message-Id: <9cb1d23551898c9c9a5f84da6773e99871285120.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b62e39b4

Convert multi-line fprintf() to warn_report() · 8297be80

由 Alistair Francis 提交于 9月 11, 2017

Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.

All of the warnings were changed using these commands:
  find ./* -type f -exec sed -i \
    'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +

Indentation fixed up manually afterwards.

Some of the lines were manually edited to reduce the line length to below
80 charecters. Some of the lines with newlines in the middle of the
string were also manually edit to avoid checkpatch errrors.

The #include lines were manually updated to allow the code to compile.

Several of the warning messages can be improved after this patch, to
keep this patch mechanical this has been moved into a later patch.
Signed-off-by: NAlistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Reviewed-by: NMarkus Armbruster <armbru@redhat.com>
Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8297be80

Convert single line fprintf(.../n) to warn_report() · 2ab4b135

由 Alistair Francis 提交于 9月 11, 2017

Convert all the single line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.

All of the warnings were changed using this command:
  find ./* -type f -exec sed -i \
    's|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig' \
    {} +

Some of the lines were manually edited to reduce the line length to below
80 charecters.

The #include lines were manually updated to allow the code to compile.
Signed-off-by: NAlistair Francis <alistair.francis@xilinx.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com> [mips]
Message-Id: <ae8f8a7f0a88ded61743dff2adade21f8122a9e7.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2ab4b135

hw/i386: Improve some of the warning messages · 9e5d2c52

由 Alistair Francis 提交于 9月 11, 2017

Signed-off-by: NAlistair Francis <alistair.francis@xilinx.com>
Suggested-by: NEduardo Habkost <ehabkost@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1d6ef2ccd9667878ed5820fcf17eef35957ea5d8.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9e5d2c52

multiboot: validate multiboot header address values · ed4f86e8

由 Prasad J Pandit 提交于 9月 07, 2017

While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute kernel
size and kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.

This is CVE-2017-14167.
Reported-by: NThomas Garnier <thgarnie@google.com>
Signed-off-by: NPrasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ed4f86e8

pc: use generic cpu_model parsing · 311ca98d

由 Igor Mammedov 提交于 9月 13, 2017

define default CPU type in generic way in pc_machine_class_init()
and let common machine code to handle cpu_model parsing

Patch also introduces TARGET_DEFAULT_CPU_TYPE define for 2 purposes:
  * make foo_machine_class_init() look uniform on every target
  * use define in [bsd|linux]-user targets to pick default
    cpu type
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1505318697-77161-5-git-send-email-imammedo@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

311ca98d

08 9月, 2017 3 次提交

intel_iommu: fix missing BQL in pt fast path · 66a4a031

由 Peter Xu 提交于 8月 17, 2017

In vtd_switch_address_space() we did the memory region switch, however
it's possible that the caller of it has not taken the BQL at all. Make
sure we have it.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

66a4a031

hw/acpi: Move acpi_set_pci_info to pcihp · ab938ae4

由 Anthony PERARD 提交于 9月 06, 2017

HW part of ACPI PCI hotplug in QEMU depends on ACPI_PCIHP_PROP_BSEL
being set on a PCI bus that supports ACPI hotplug. It should work
regardless of the source of ACPI tables (QEMU generator/legacy SeaBIOS/Xen).
So move ACPI_PCIHP_PROP_BSEL initialization into HW ACPI implementation
part from QEMU's ACPI table generator.

To do PCI passthrough with Xen, the property ACPI_PCIHP_PROP_BSEL needs
to be set, but this was done only when ACPI tables are built which is
not needed for a Xen guest. The need for the property starts with commit
"pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice"
(f0c9d64a).

Adding find_i440fx into stubs so that mips-softmmu target can be built.
Reported-by: NSander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: NAnthony PERARD <anthony.perard@citrix.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

ab938ae4

pc: add 2.11 machine types · a6fd5b0e

由 Marcel Apfelbaum 提交于 9月 06, 2017

Signed-off-by: NMarcel Apfelbaum <marcel@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

a6fd5b0e

31 8月, 2017 1 次提交

i386: replace g_malloc()+memcpy() with g_memdup() · ad2c1993

由 Marc-André Lureau 提交于 6月 07, 2017

I found these pattern via grepping the source tree. I don't have a
coccinelle script for it!
Signed-off-by: NMarc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NRichard Henderson <rth@twiddle.net>

ad2c1993

23 8月, 2017 1 次提交

numa: Move numa_legacy_auto_assign_ram to pc-i440fx-2.9 · 3da2bd8c

由 Eduardo Habkost 提交于 8月 18, 2017

The 'm->numa_auto_assign_ram = numa_legacy_auto_assign_ram;' line
was supposed to be in pc_i440fx_2_9_machine_options() (see commit
3bfe5716 "numa: equally distribute memory on nodes"), but the
merge commit adb354dd ("Merge remote-tracking branch
'mst/tags/for_upstream' into staging") moved it to the
pc_i440fx_2_10_machine_options().

Move the line back to pc_i440fx_2_9_machine_options().
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Message-id: 20170818190943.23858-1-ehabkost@redhat.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

3da2bd8c

22 8月, 2017 1 次提交

hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev' · 04790978

由 Thomas Huth 提交于 8月 21, 2017

QEMU currently crashes when trying to use a 'pc-dimm' on the pseries
machine without specifying its 'memdev' property. This happens because
pc_dimm_get_memory_region() does not check whether the 'memdev' property
has properly been set by the user. Looking closer at this function, it's
also obvious that it is using &error_abort to call another function - and
this is bad in a function that is used in the hot-plugging calling chain
since this can also cause QEMU to exit unexpectedly.

So let's fix these issues in a proper way now: Add a "Error **errp"
parameter to pc_dimm_get_memory_region() which we use in case the 'memdev'
property has not been set by the user, and which we can use instead of
the &error_abort, and change the callers of get_memory_region() to make
use of this "errp" parameter for proper error checking.
Signed-off-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

04790978

08 8月, 2017 1 次提交

hw/i386: allow SHPC for Q35 machine · a41c78c1

由 Aleksandr Bezzubikov 提交于 7月 29, 2017

Unmask previously masked SHPC feature in _OSC method.
Signed-off-by: NAleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: NMarcel Apfelbaum <marcel@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

a41c78c1

02 8月, 2017 4 次提交

pc: acpi: force FADT rev1 for 440fx based machine types · 3a3fcc75

由 Igor Mammedov 提交于 7月 24, 2017

w2k used to boot on QEMU until revision of FADT has
been bumped to rev3
(commit 77af8a2b hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.)

Keep PC machine at rev1 to remain compatible and Q35
at rev3 where w2k isn't supported anyway so OSX could
run as well.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Tested-by: NJohn Arbuckle <programmingkidx@gmail.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

3a3fcc75

pc: make 'pc.rom' readonly when machine has PCI enabled · 208fa0e4

由 Igor Mammedov 提交于 7月 28, 2017

looking at bios ROM mapping in QEMU it seems that only isapc
(i.e. not PCI enabled machine) requires ROM being mapped as
RW in other cases BIOS is mapped as RO. Do the same for option
ROM 'pc.rom' when machine has PCI enabled.

As useful side-effect pc.rom MemoryRegion stops being
put in vhost memory map (filtered out by vhost_section()),
which reduces number of entries by 1.

Coincidentally it fixes migration failure reported in

"[PATCH V2]  vhost: fix a migration failed because of vhost region merge"

where following destination CLI with /sys/module/vhost/parameters/max_mem_regions = 8

export DIMMSCOUNT=6
QEMU -enable-kvm \
     -netdev type=tap,id=guest0,vhost=on,script=no,vhostforce \
     -device virtio-net-pci,netdev=guest0 \
     -m 256,slots=256,maxmem=2G \
     `i=0; while [ $i -lt $DIMMSCOUNT ]; do echo \
         "-object memory-backend-ram,id=m$i,size=128M \
          -device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); \
     done`

will fail to startup with error:

 "-device pc-dimm,id=d5,memdev=m5: a used vhost backend has no free memory slots left"

while it's possible to add the 6th DIMM during hotplug
on source.

Issue is caused by the fact that number of entries in vhost map
is bigger on 1 entry, when -device is processed, than
after guest boots up, and that offending entry belongs to
'pc.rom', it's not like vhost intends to do IO in ROM range
so making it RO hides region from vhost and makes number
of entries in vhost memory map at -device/machine_done time
match number of entries after guest boots.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reported-by: NPeng Hao <peng.hao2@zte.com.cn>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

208fa0e4

intel_iommu: use access_flags for iotlb · 07f7b733

由 Peter Xu 提交于 7月 17, 2017

It was cached by read/write separately. Let's merge them.
Signed-off-by: NPeter Xu <peterx@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

07f7b733

intel_iommu: fix iova for pt · 892721d9

由 Peter Xu 提交于 7月 17, 2017

IOMMUTLBEntry.iova is returned incorrectly on one PT path (though mostly
we cannot really trigger this path, even if we do, we are mostly
disgarding this value, so it didn't break anything). Fix it by
converting the VTD_PAGE_MASK into the correct definition
VTD_PAGE_MASK_4K, then remove VTD_PAGE_MASK.

Fixes: b93130 ("intel_iommu: cleanup vtd_{do_}iommu_translate()")
Signed-off-by: NPeter Xu <peterx@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

892721d9

01 8月, 2017 2 次提交

trace-events: fix code style: print 0x before hex numbers · 8908eb1a

由 Vladimir Sementsov-Ogievskiy 提交于 7月 31, 2017

The only exception are groups of numers separated by symbols
'.', ' ', ':', '/', like 'ab.09.7d'.

This patch is made by the following:

> find . -name trace-events | xargs python script.py

where script.py is the following python script:
=========================
 #!/usr/bin/env python

import sys
import re
import fileinput

rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)

files = sys.argv[1:]

for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])

        sys.stdout.write(''.join(arr))
=========================
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Acked-by: NCornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>

8908eb1a

trace-events: fix code style: %# -> 0x% · db73ee4b

由 Vladimir Sementsov-Ogievskiy 提交于 7月 31, 2017

In trace format '#' flag of printf is forbidden. Fix it to '0x%'.

This patch is created by the following:

check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56

check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0

check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0

fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'

[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>

db73ee4b

31 7月, 2017 1 次提交

docs: fix broken paths to docs/devel/tracing.txt · 87e0331c

由 Philippe Mathieu-Daudé 提交于 7月 28, 2017

With the move of some docs/ to docs/devel/ on ac06724a,
no references were updated.
Signed-off-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>

87e0331c

22 7月, 2017 2 次提交

xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause... · 7fb394ad

由 Alexey G 提交于 7月 21, 2017

xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings

Under certain circumstances normal xen-mapcache functioning may be broken
by guest's actions. This may lead to either QEMU performing exit() due to
a caught bad pointer (and with QEMU process gone the guest domain simply
appears hung afterwards) or actual use of the incorrect pointer inside
QEMU address space -- a write to unmapped memory is possible. The bug is
hard to reproduce on a i440 machine as multiple DMA sources are required
(though it's possible in theory, using multiple emulated devices), but can
be reproduced somewhat easily on a Q35 machine using an emulated AHCI
controller -- each NCQ queue command slot may be used as an independent
DMA source ex. using READ FPDMA QUEUED command, so a single storage
device on the AHCI controller port will be enough to produce multiple DMAs
(up to 32). The detailed description of the issue follows.

Xen-mapcache provides an ability to map parts of a guest memory into
QEMU's own address space to work with.

There are two types of cache lookups:
 - translating a guest physical address into a pointer in QEMU's address
   space, mapping a part of guest domain memory if necessary (while trying
   to reduce a number of such (re)mappings to a minimum)
 - translating a QEMU's pointer back to its physical address in guest RAM

These lookups are managed via two linked-lists of structures.
MapCacheEntry is used for forward cache lookups, while MapCacheRev -- for
reverse lookups.

Every guest physical address is broken down into 2 parts:
    address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
    address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);

MCACHE_BUCKET_SHIFT depends on a system (32/64) and is equal to 20 for
a 64-bit system (which assumed for the further description). Basically,
this means that we deal with 1 MB chunks and offsets within those 1 MB
chunks. All mappings are created with 1MB-granularity, i.e. 1MB/2MB/3MB
etc. Most DMA transfers typically are less than 1MB, however, if the
transfer crosses any 1MB border(s) - than a nearest larger mapping size
will be used, so ex. a 512-byte DMA transfer with the start address
700FFF80h will actually require a 2MB range.

Current implementation assumes that MapCacheEntries are unique for a given
address_index and size pair and that a single MapCacheEntry may be reused
by multiple requests -- in this case the 'lock' field will be larger than
1. On other hand, each requested guest physical address (with 'lock' flag)
is described by each own MapCacheRev. So there may be multiple MapCacheRev
entries corresponding to a single MapCacheEntry. The xen-mapcache code
uses MapCacheRev entries to retrieve the address_index & size pair which
in turn used to find a related MapCacheEntry. The 'lock' field within
a MapCacheEntry structure is actually a reference counter which shows
a number of corresponding MapCacheRev entries.

The bug lies in ability for the guest to indirectly manipulate with the
xen-mapcache MapCacheEntries list via a special sequence of DMA
operations, typically for storage devices. In order to trigger the bug,
guest needs to issue DMA operations in specific order and timing.
Although xen-mapcache is protected by the mutex lock -- this doesn't help
in this case, as the bug is not due to a race condition.

Suppose we have 3 DMA transfers, namely A, B and C, where
- transfer A crosses 1MB border and thus uses a 2MB mapping
- transfers B and C are normal transfers within 1MB range
- and all 3 transfers belong to the same address_index

In this case, if all these transfers are to be executed one-by-one
(without overlaps), no special treatment necessary -- each transfer's
mapping lock will be set and then cleared on unmap before starting
the next transfer.
The situation changes when DMA transfers overlap in time, ex. like this:

  |===== transfer A (2MB) =====|

              |===== transfer B (1MB) =====|

                          |===== transfer C (1MB) =====|
 time --->

In this situation the following sequence of actions happens:

1. transfer A creates a mapping to 2MB area (lock=1)
2. transfer B (1MB) tries to find available mapping but cannot find one
   because transfer A is still in progress, and it has 2MB size + non-zero
   lock. So transfer B creates another mapping -- same address_index,
   but 1MB size.
3. transfer A completes, making 1st mapping entry available by setting its
   lock to 0
4. transfer C starts and tries to find available mapping entry and sees
   that 1st entry has lock=0, so it uses this entry but remaps the mapping
   to a 1MB size
5. transfer B completes and by this time
  - there are two locked entries in the MapCacheEntry list with the SAME
    values for both address_index and size
  - the entry for transfer B actually resides farther in list while
    transfer C's entry is first
6. xen_ram_addr_from_mapcache() for transfer B gets correct address_index
   and size pair from corresponding MapCacheRev entry, but then it starts
   looking for MapCacheEntry with these values and finds the first entry
   -- which belongs to transfer C.

At this point there may be following possible (bad) consequences:

1. xen_ram_addr_from_mapcache() will use a wrong entry->vaddr_base value
   in this statement:

   raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) +
       ((unsigned long) ptr - (unsigned long) entry->vaddr_base);

resulting in an incorrent raddr value returned from the function. The
(ptr - entry->vaddr_base) expression may produce both positive and negative
numbers and its actual value may differ greatly as there are many
map/unmap operations take place. If the value will be beyond guest RAM
limits then a "Bad RAM offset" error will be triggered and logged,
followed by exit() in QEMU.

2. If raddr value won't exceed guest RAM boundaries, the same sequence
of actions will be performed for xen_invalidate_map_cache_entry() on DMA
unmap, resulting in a wrong MapCacheEntry being unmapped while DMA
operation which uses it is still active. The above example must
be extended by one more DMA transfer in order to allow unmapping as the
first mapping in the list is sort of resident.

The patch modifies the behavior in which MapCacheEntry's are added to the
list, avoiding duplicates.
Signed-off-by: NAlexey Gerasimenko <x1917x@gmail.com>
Signed-off-by: NStefano Stabellini <sstabellini@kernel.org>

7fb394ad

xen: fix compilation on 32-bit hosts · 9e6bdb92

由 Igor Druzhinin 提交于 7月 21, 2017

Signed-off-by: NIgor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>

9e6bdb92

19 7月, 2017 4 次提交

xen: don't use xenstore to save/restore physmap anymore · 331b5189

由 Igor Druzhinin 提交于 7月 10, 2017

If we have a system with xenforeignmemory_map2() implemented
we don't need to save/restore physmap on suspend/restore
anymore. In case we resume a VM without physmap - try to
recreate the physmap during memory region restore phase and
remap map cache entries accordingly. The old code is left
for compatibility reasons.
Signed-off-by: NIgor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: NPaul Durrant <paul.durrant@citrix.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Signed-off-by: NStefano Stabellini <sstabellini@kernel.org>

331b5189

xen/mapcache: introduce xen_replace_cache_entry() · 5ba3d756

由 Igor Druzhinin 提交于 7月 10, 2017

This new call is trying to update a requested map cache entry
according to the changes in the physmap. The call is searching
for the entry, unmaps it and maps again at the same place using
a new guest address. If the mapping is dummy this call will
make it real.

This function makes use of a new xenforeignmemory_map2() call
with an extended interface that was recently introduced in
libxenforeignmemory [1].

[1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg113007.htmlSigned-off-by: NIgor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: NPaul Durrant <paul.durrant@citrix.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Signed-off-by: NStefano Stabellini <sstabellini@kernel.org>

5ba3d756

xen/mapcache: add an ability to create dummy mappings · 75923565

由 Igor Druzhinin 提交于 7月 10, 2017

Dummys are simple anonymous mappings that are placed instead
of regular foreign mappings in certain situations when we need
to postpone the actual mapping but still have to give a
memory region to QEMU to play with.

This is planned to be used for restore on Xen.
Signed-off-by: NIgor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: NPaul Durrant <paul.durrant@citrix.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>

75923565

xen: move physmap saving into a separate function · 697b66d0

由 Igor Druzhinin 提交于 7月 10, 2017

Non-functional change.
Signed-off-by: NIgor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Reviewed-by: NPaul Durrant <paul.durrant@citrix.com>

697b66d0