Commits · fa876bc97a9d6f95c73db56587c64380aa282fe4 · openeuler / qemu

21 Jun, 2018 · 19 commits

address_space_access_valid: address_space_to_flatview needs RCU lock · fa876bc9

Committed by Paolo Bonzini on Mar 05, 2018

address_space_access_valid is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_access_valid to address_space_access_valid.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 11e732a5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

address_space_read: address_space_to_flatview needs RCU lock · f77c2312

Committed by Paolo Bonzini on Mar 05, 2018

address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b2a44fca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

address_space_write: address_space_to_flatview needs RCU lock · df04d1f1

Committed by Paolo Bonzini on Mar 05, 2018

address_space_write is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_write to address_space_write.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4c6ebbb3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

memory: inline some performance-sensitive accessors · ac25a325

Committed by Paolo Bonzini on Mar 05, 2018

These accessors are called from inlined functions, and the call sequence
is much more expensive than just inlining the access.  Move the
struct declaration to memory-internal.h so that exec.c and memory.c
can both use an inline function.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 785a507e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

openpic_kvm: drop address_space_to_flatview call · 8e8b7399

Committed by Paolo Bonzini on Mar 05, 2018

The MemoryListener is registered on address_space_memory, so there is
not much to assert.  This currently works because the callback is
invoked only once, when the listener is registered.  On later calls,
however, section->fv is the _new_ FlatView, not the old one, and the
assertion would break.

This confines address_space_to_flatview to exec.c and memory.c.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 80d2b933)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

sparc: fix leon3 casa instruction when MMU is disabled · 49921182

Committed by KONRAD Frederic on Mar 02, 2018

Since the commit af7a06ba:
`casa [..](10), .., ..` (and probably other alternate space instructions)
triggers a data access exception when the MMU is disabled.

On entry to get_asi(...), dc->mem_idx is already set to MMU_PHYS_IDX when
the MMU is disabled. Keep mem_idx unchanged in this case so that accesses
pass straight through while the MMU is disabled.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit 6e10f37c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

linux-user: fix target_mprotect/target_munmap error return values · e1f5a04d

Committed by Max Filippov on Feb 28, 2018

target_mprotect/target_munmap return value goes through get_errno at the
call site, thus the functions must either set errno to host error code
and return -1 or return negative guest error code. Do the latter.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-8-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 78cf3390)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

linux-user: fix assertion in shmdt · 8fc971ed

Committed by Max Filippov on Feb 28, 2018

shmdt fails to call mmap_lock/mmap_unlock around page_set_flags,
resulting in the following assertion:
  page_set_flags: Assertion `have_mmap_lock()' failed.

Wrap shmdt internals into mmap_lock/mmap_unlock.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-7-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 3c5f6a5f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

linux-user: fix mmap/munmap/mprotect/mremap/shmat · 1801fabd

Committed by Max Filippov on Mar 07, 2018

In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger
than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when
mmap, munmap, mprotect, mremap or shmat is called for an address outside
the guest address space. mmap and mprotect should return ENOMEM in such
case.

Change definition of GUEST_ADDR_MAX to always be the last valid guest
address. Account for this change in open_self_maps.
Add macro guest_addr_valid that verifies if the guest address is valid.
Add function guest_range_valid that verifies if address range is within
guest address space and does not wrap around. Use that macro in
mmap/munmap/mprotect/mremap/shmat for error checking.
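
The range check described above can be sketched as follows. This is a minimal Python illustration, not QEMU's C code: the 32-bit value of `GUEST_ADDR_MAX` and the function bodies are assumptions chosen for the example, and rejecting a zero-length range is a choice of this sketch.

```python
# Illustrative sketch only: GUEST_ADDR_MAX is an assumed 32-bit guest
# address space; in QEMU the value depends on the target configuration.
GUEST_ADDR_MAX = 0xFFFFFFFF  # last valid guest address (inclusive)


def guest_addr_valid(addr: int) -> bool:
    """Return True if addr lies inside the guest address space."""
    return 0 <= addr <= GUEST_ADDR_MAX


def guest_range_valid(start: int, length: int) -> bool:
    """Return True if [start, start + length) fits in the guest address
    space without running past GUEST_ADDR_MAX."""
    if length <= 0 or length - 1 > GUEST_ADDR_MAX:
        return False
    return 0 <= start <= GUEST_ADDR_MAX - length + 1


print(guest_range_valid(0xFFFFF000, 0x1000))  # range ends exactly at the top
print(guest_range_valid(0xFFFFF000, 0x2000))  # range would run past the top
```

Comparing `start` against `GUEST_ADDR_MAX - length + 1`, rather than computing `start + length`, is what keeps the equivalent fixed-width C check from wrapping around.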

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180307215010.30706-1-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit ebf9a363)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/xtensa: dump correct physical registers · 2ac88347

Committed by Max Filippov on Feb 28, 2018

xtensa_cpu_dump_state outputs CPU physical registers as is, without
synchronization from current window. That may result in different values
printed for the current window and corresponding physical registers.
Synchronize physical registers from window before dumping.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit b55b1afd)
 Conflicts:
	target/xtensa/translate.c
* drop context dependencies
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

loader: don't perform overlapping address check for memory region ROM images · 1f463795

Committed by Mark Cave-Ayland on Feb 23, 2018

All memory region ROM images have a base address of 0 which causes the overlapping
address check to fail if more than one memory region ROM image is present, or an
existing ROM image is loaded at address 0.

Make sure that we ignore the overlapping address check in
rom_check_and_register_reset() if this is a memory region ROM image. In particular
this fixes the "rom: requested regions overlap" error on startup when trying to
run qemu-system-sparc with a -kernel image since commit 74976386: "tcx: switch to
load_image_mr() and remove prom_addr hack".
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit ca316c11)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: Set the flags of the CMD_INIT command to 0 · e2fc495b

Committed by Stefan Berger on Jan 26, 2018

The flags of the CMD_INIT control channel command were not
initialized properly. Fix this and set to 0.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 30270587)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

rbd: Fix use after free in qemu_rbd_set_keypairs() error path · 4f818781

Committed by Kevin Wolf on Feb 16, 2018

If we want to include the invalid option name in the error message, we
can't free the string earlier than that.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 71c87815)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

specs/qcow2: Fix documentation of the compressed cluster descriptor · 6f36bd39

Committed by Alberto Garcia on Feb 21, 2018

This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:

1) the documentation claims that the cluster descriptor contains the
   number of sectors used to store the compressed data, but what it
   actually contains is the number of sectors *minus one* or, in other
   words, the number of additional sectors after the first one.

2) the width of the fields is incorrectly specified. The number of bits
   used by each field is

      x = 62 - (cluster_bits - 8)   for the offset field
      y = (cluster_bits - 8)        for the size field

   So the offset field's location is [0, x-1], not [0, x] as stated.

3) the size field does not contain the size of the compressed data,
   but rather the number of sectors where that data is stored. The
   compressed data starts at the exact point specified in the offset
   field and ends when there's enough data to produce a cluster of
   decompressed data. Both points can be in the middle of a sector,
   allowing several compressed clusters to be stored next to one
   another, sharing sectors if necessary.
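
The corrected field layout can be sketched as a small decoder. This is an illustrative Python sketch, not code from QEMU: the function name is hypothetical, and any bits above bit 61 of the value passed in are simply ignored.

```python
def decode_compressed_descriptor(descriptor: int, cluster_bits: int):
    """Split a compressed cluster descriptor into (offset, sector_count)."""
    y = cluster_bits - 8   # width of the size field, in bits
    x = 62 - y             # width of the offset field, in bits

    offset = descriptor & ((1 << x) - 1)             # bits [0, x-1]
    additional = (descriptor >> x) & ((1 << y) - 1)  # bits [x, 61]

    # The stored value is the sector count *minus one*, i.e. the number
    # of additional sectors after the first one.
    return offset, additional + 1


# With cluster_bits = 16 the offset field is 54 bits wide and the size
# field is 8 bits wide.
offset, sectors = decode_compressed_descriptor((3 << 54) | 0x50200, 16)
print(hex(offset), sectors)  # 0x50200 4
```

A stored size value of 3 therefore means the compressed data spans four sectors, starting with the sector that contains the offset.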

Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 156b46de)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

nbd: Honor server's advertised minimum block size · 0430aa08

Committed by Eric Blake on Feb 14, 2018

Commit 79ba8c98 (v2.7) changed the setting of request_alignment
to occur only during bdrv_refresh_limits(), rather than at
bdrv_open() time; but at the time, NBD was unaffected, because
it still used sector-based callbacks, so the block layer
defaulted NBD to use 512 request_alignment.

Later, commit 70c4fb26 (also v2.7) changed NBD to use byte-based
callbacks, without setting request_alignment.  This resulted in
NBD using request_alignment of 1, which works great when the
server supports it (as is the case for qemu-nbd), but falls apart
miserably if the server requires alignment (but only if qemu
actually sends a sub-sector request; qemu-io can do it, but
most qemu operations still perform on sectors or larger).

Even later, the NBD protocol was updated to document that clients
should learn the server's minimum alignment during NBD_OPT_GO;
and recommended that clients should assume a minimum size of 512
unless the server understands NBD_OPT_GO and replied with a smaller
size.  Commit 081dd1fe (v2.10) attempted to do that, by assigning
request_alignment to whatever was learned from the server; but
it has two flaws: the assignment is done during bdrv_open() so
it gets unconditionally wiped out back to 1 during any later
bdrv_refresh_limits(); and the code is not using a default of 512
when the server did not report a minimum size.

Fix these issues by moving the assignment to request_alignment
to the right function, and by using a sane default when the
server does not advertise a minimum size.
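
The fallback rule can be illustrated with a short sketch. This is Python for illustration only, not the QEMU change itself: the function and constant names are hypothetical, and treating `None` or `0` as "server did not advertise a minimum size" is an assumption of the example.

```python
from typing import Optional

DEFAULT_ALIGNMENT = 512  # fallback value named in the commit message


def effective_request_alignment(server_min_block: Optional[int]) -> int:
    """Pick the request alignment from what the server advertised, if anything."""
    if server_min_block:      # server reported a non-zero minimum block size
        return server_min_block
    return DEFAULT_ALIGNMENT  # nothing advertised: assume 512


print(effective_request_alignment(1))     # server allows byte-granular requests
print(effective_request_alignment(None))  # server said nothing
```

The second half of the fix, doing this assignment where bdrv_refresh_limits() will not wipe it out again, is about where the code runs and is not modeled here.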

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180215032905.27146-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit fd8d372d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

spapr: make pseries-2.11 the default machine type · 327d4645

Committed by Greg Kurz on Jun 20, 2018

The spapr capability framework was introduced in QEMU 2.12. It gives
explicit control over how host features are exposed to the guest. This
is especially needed to handle migration between heterogeneous hosts
(e.g., POWER8 to POWER9). It is also used to expose fixes and workarounds
for speculative execution vulnerabilities to guests. The framework was
hence backported to QEMU 2.11.1, in particular these commits:

0fac4aa9 spapr: Add pseries-2.12 machine type
9070f408 spapr: Treat Hardware Transactional Memory (HTM) as an
 optional capability

0fac4aa9 has the confusing effect of making pseries-2.12 the default
machine type for QEMU 2.11.1, instead of the expected pseries-2.11. This
patch changes the default machine back to pseries-2.11.

Unfortunately, 9070f408 enforces the HTM capability for pseries-2.11
to be enabled by default, ie, when not passing cap-htm on the command
line. This breaks several 'make check' testcases that run qemu-system-ppc64
with TCG.

The only sane way to fix this is to adapt the impacted testcases so that
they all pass cap-htm=off in this case. This patch does that as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: unref the memory region before continuing · b940928f

Committed by Tiwei Bie on Jan 25, 2018

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b86107ab)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pci-bridge/i82801b11: clear bridge registers on platform reset · c3c44a4f

Committed by Laszlo Ersek on Feb 07, 2018

The "i82801b11-bridge" device model is a descendant of "base-pci-bridge"
(TYPE_PCI_BRIDGE). However, unlike other similar devices, such as

- pci-bridge,
- pcie-pci-bridge,
- PCIE Root Port,
- xio3130 switch upstream and downstream ports,
- dec-21154-p2p-bridge,
- pbm-bridge,
- xilinx-pcie-root,

"i82801b11-bridge" does not clear the bridge specific registers at
platform reset.

This is a problem because devices on "i82801b11-bridge" continue to
respond to config space cycles after platform reset, when addressed with
the bus number that was previously programmed into the secondary bus
number register of "i82801b11-bridge". This error breaks OVMF's search for
extra (PXB) root buses, for example.

The device class reset method for "i82801b11-bridge" is currently NULL;
set it directly to pci_bridge_reset(), like the last three bridge models
in the above listing do.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: qemu-stable@nongnu.org
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1541839
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ed247f40)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/ssh: fix possible segmentation fault when .desc is not null-terminated · 82c5191f

Committed by Murilo Opsfelder Araujo on Jan 05, 2018

This patch prevents a possible segmentation fault when .desc members are checked
against NULL.

The ssh_runtime_opts was added by commit
8a6a8089 ("block/ssh: Use QemuOpts for runtime
options").

This fix was inspired by
http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg00883.html.

Fixes: 8a6a8089 ("block/ssh: Use QemuOpts for runtime options")
Cc: Max Reitz <mreitz@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit fbd5c4c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

22 May, 2018 · 12 commits

spapr: register dummy ICPs later · 72cc467a

Committed by Greg Kurz on Feb 27, 2018

Some older machine types create more ICPs than needed. We hence
need to register up to xics_max_server_number() dummy ICPs to
accommodate the migration of these machine types.

Recent VSMT rework changed xics_max_server_number() to return

    DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads)

instead of

    DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);

The change is okay but it requires spapr->vsmt to be set, which
isn't the case with the current code. This causes the formula to
return zero and we don't create dummy ICPs. This breaks migration
of older guests as reported here:

    https://bugzilla.redhat.com/show_bug.cgi?id=1549087
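
A quick way to see why an unset spapr->vsmt leads to no dummy ICPs is to evaluate the formula with zero. The following Python sketch mirrors the DIV_ROUND_UP expression quoted above; the function names and the sample values (16 CPUs, 4 threads per core, VSMT 8) are assumptions for illustration.

```python
def div_round_up(n: int, d: int) -> int:
    """Integer division rounding up, like the DIV_ROUND_UP macro."""
    return (n + d - 1) // d


def max_server_number(max_cpus: int, vsmt: int, smp_threads: int) -> int:
    """Mirror of DIV_ROUND_UP(max_cpus * vsmt, smp_threads)."""
    return div_round_up(max_cpus * vsmt, smp_threads)


print(max_server_number(16, 8, 4))  # 32: vsmt already set
print(max_server_number(16, 0, 4))  # 0: vsmt not set yet, no dummy ICPs
```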

The dummy ICP workaround doesn't really have a dependency on XICS
itself. But it does depend on proper VCPU id numbering and it must
be applied before creating vCPUs (ie, creating real ICPs). So this
patch moves the workaround to spapr_init_cpus(), which already
assumes VSMT to be set.

Fixes: 72194664 ("spapr: use spapr->vsmt to compute VCPU ids")
Reported-by: Lukas Doktor <ldoktor@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 72fdd4de)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: fix missing CPU core nodes in DT when running with TCG · 184da186

Committed by Greg Kurz on Feb 16, 2018

Commit 5d0fb150 "spapr: consolidate the VCPU id numbering logic
in a single place" introduced a helper to detect thread0 of a virtual
core based on its VCPU id. This is used to create CPU core nodes in
the DT, but it is broken in TCG.

$ qemu-system-ppc64 -nographic -accel tcg -machine dumpdtb=dtb.bin \
                    -smp cores=16,maxcpus=16,threads=1
$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {

instead of the expected 16 cores that we get with KVM:

$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {
                PowerPC,POWER8@10 {
                PowerPC,POWER8@18 {
                PowerPC,POWER8@20 {
                PowerPC,POWER8@28 {
                PowerPC,POWER8@30 {
                PowerPC,POWER8@38 {
                PowerPC,POWER8@40 {
                PowerPC,POWER8@48 {
                PowerPC,POWER8@50 {
                PowerPC,POWER8@58 {
                PowerPC,POWER8@60 {
                PowerPC,POWER8@68 {
                PowerPC,POWER8@70 {
                PowerPC,POWER8@78 {

This happens because spapr_get_vcpu_id() maps VCPU ids to
cs->cpu_index in TCG mode. This confuses the code in
spapr_is_thread0_in_vcore(), since it assumes thread0 VCPU
ids to have a spapr->vsmt spacing.

    spapr_get_vcpu_id(cpu) % spapr->vsmt == 0
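
The effect can be reproduced with a few lines of arithmetic. This Python sketch is illustrative only: it assumes a VSMT spacing of 8, which is consistent with the node names in the KVM listing above, and it uses one thread per core as in the -smp line.

```python
VSMT = 8      # assumed spacing between virtual cores
NCORES = 16   # matches -smp cores=16,threads=1


def is_thread0_in_vcore(vcpu_id: int, vsmt: int) -> bool:
    """The thread0 test quoted above: vcpu_id % vsmt == 0."""
    return vcpu_id % vsmt == 0


# Before the fix, TCG reported cs->cpu_index (0, 1, 2, ...) as the VCPU id.
tcg_ids = list(range(NCORES))
# With one thread per core, the computed VCPU id is core index * vsmt.
computed_ids = [core * VSMT for core in range(NCORES)]

tcg_cores = [hex(i) for i in tcg_ids if is_thread0_in_vcore(i, VSMT)]
fixed_cores = [hex(i) for i in computed_ids if is_thread0_in_vcore(i, VSMT)]

print(tcg_cores)         # ['0x0', '0x8']
print(len(fixed_cores))  # 16
```

Only ids 0 and 8 pass the test when cpu_index is used, which matches the two core nodes seen with TCG; with computed ids all 16 cores pass, ending at 0x78.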

Actually, there's no real reason to expose cs->cpu_index instead
of the VCPU id, since we also generate it with TCG. Also we already
set it explicitly in spapr_set_vcpu_id(), so there's no real reason
either to call kvm_arch_vcpu_id() with KVM.

This patch unifies spapr_get_vcpu_id() to always return the computed
VCPU id both in TCG and KVM. This is one step forward towards KVM<->TCG
migration.

Fixes: 5d0fb150
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit b1a568c1)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: consolidate the VCPU id numbering logic in a single place · e676038e

Committed by Greg Kurz on Feb 14, 2018

Several places in the code need to calculate a VCPU id:

    (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
    (core_id / smp_threads) * spapr->vsmt (1 user)
    index * spapr->vsmt (2 users)

or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:

    index % spapr->vsmt != 0

Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.

This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds a helper to
guess if a VCPU is the first thread of a VCORE.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Rename spapr_is_vcore() to spapr_is_thread0_in_vcore() for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 5d0fb150)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() · 094706e6

Committed by Greg Kurz on Feb 14, 2018

The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.

The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 14bb4486)
Signed-off-by: Greg Kurz <groug@kaod.org>

target/ppc: Clarify compat mode max_threads value · 8c0ec3c3

Committed by David Gibson on Jan 15, 2018

We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatibility modes table.  It's all about guest expectations, not host
expectations or support (that's handled elsewhere).

In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
(cherry picked from commit abbc1247)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: move VCPU calculation to core machine code · a8074a58

Committed by Greg Kurz on Feb 14, 2018

The VCPU ids are currently computed and assigned to each individual
CPU thread in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.

The current formula used in spapr_cpu_core_realize() is:

    vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

where:

    cc->core_id is a multiple of smp_threads
    cpu_index = cc->core_id + i
    0 <= i < smp_threads

So we have:

    cpu_index % smp_threads == i
    cc->core_id / smp_threads == cpu_index / smp_threads

hence:

    vcpu_id =
        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;

This formula was used before VSMT, at the time VCPU ids were computed
at the target emulation level. It has the advantage of being usable
to derive a VCPU id from a CPU index alone. It is suitable for all the
places where the machine code has to compute a VCPU id.
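
The equivalence of the two formulas can be checked numerically. This is a Python sketch for illustration, not the C helper the patch adds; the function names and the sample topology (4 threads per core, VSMT 8, 16 CPUs) are assumptions.

```python
def vcpu_id_from_core(core_id: int, i: int, smp_threads: int, vsmt: int) -> int:
    """Formula used in spapr_cpu_core_realize(); core_id is a multiple of
    smp_threads, so the integer division is exact."""
    return core_id * vsmt // smp_threads + i


def vcpu_id_from_index(cpu_index: int, smp_threads: int, vsmt: int) -> int:
    """Equivalent formula that needs only the CPU index."""
    return (cpu_index // smp_threads) * vsmt + cpu_index % smp_threads


SMP_THREADS, VSMT = 4, 8
ids = []
for core_id in range(0, 16, SMP_THREADS):
    for i in range(SMP_THREADS):
        cpu_index = core_id + i
        a = vcpu_id_from_core(core_id, i, SMP_THREADS, VSMT)
        b = vcpu_id_from_index(cpu_index, SMP_THREADS, VSMT)
        assert a == b  # both formulas agree for every thread
        ids.append(b)

print(ids)  # [0, 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27]
```

The gaps in the printed list (4 to 7, 12 to 15, and so on) are the VSMT spacing: each core starts at a multiple of 8 even though it only has 4 threads.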

This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 648edb64)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: use spapr->vsmt to compute VCPU ids · cc86c467

Committed by Greg Kurz on Feb 14, 2018

Since the introduction of VSMT in 2.11, the spacing of VCPU ids
between cores is controllable through a machine property instead
of being only dictated by the SMT mode of the host:

    cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

Until recently, the machine code would try to change the SMT mode
of the host to be equal to VSMT or exit. This allowed the rest of
the code to assume that kvmppc_smt_threads() == spapr->vsmt is
always true.

Recent commit "8904e5a7 spapr: Adjust default VSMT value for
better migration compatibility" relaxed the rule. If the VSMT
mode cannot be set in KVM for some reason, but the requested
CPU topology is compatible with the current SMT mode, then we
let the guest run with kvmppc_smt_threads() != spapr->vsmt.

This breaks quite a few places in the code, in particular when
calculating DRC indexes.

This is what happens on a POWER host with subcores-per-core=2 (ie,
supports up to SMT4) when passing the following topology:

    -smp threads=4,maxcpus=16 \
    -device host-spapr-cpu-core,core-id=4,id=core1 \
    -device host-spapr-cpu-core,core-id=8,id=core2

qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)

This is expected since KVM is limited to SMT4, but the guest is started
anyway because this topology can run on SMT4 even with a VSMT8 spacing.

But when we look at the DT, things get nastier:

cpus {
        ...
        ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;

This means that we have the following association:

 CPU core device |     DRC    | VCPU id
-----------------+------------+---------
   boot core     | 0x10000000 | 0
   core1         | 0x10000004 | 4
   core2         | 0x10000008 | 8
   core3         | 0x1000000c | 12

But since the spacing of VCPU ids is 8, the DRC for core1 points to a
VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
core1, and so on...

        ...

        PowerPC,POWER8@0 {
                ...
                ibm,my-drc-index = <0x10000000>;
                ...
        };

        PowerPC,POWER8@8 {
                ...
                ibm,my-drc-index = <0x10000008>;
                ...
        };

        PowerPC,POWER8@10 {
                ...

No ibm,my-drc-index property for this core since 0x10000010 doesn't
exist in ibm,drc-indexes above.

                ...
        };
};

...

interrupt-controller {
        ...
        ibm,interrupt-server-ranges = <0x0 0x10>;

With a spacing of 8, the highest VCPU id for the given topology should be:
        16 * 8 / 4 = 32 and not 16

        ...
        linux,phandle = <0x7e7323b8>;
        interrupt-controller;
};

And CPU hot-plug/unplug is broken:

(qemu) device_del core1
pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove

(qemu) device_del core2
cpu 4 (hwid 8) Ready to die...
cpu 5 (hwid 9) Ready to die...
cpu 6 (hwid 10) Ready to die...
cpu 7 (hwid 11) Ready to die...

These are the VCPU ids of core1 actually

(qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
(qemu) device_del core3
pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove

This patches all the code in hw/ppc/spapr.c to assume the VSMT
spacing when manipulating VCPU ids.

Fixes: 8904e5a7
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

(cherry picked from commit 72194664)
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: set vsmt to MAX(8, smp_threads) · c30b366d

Committed by Laurent Vivier on Feb 09, 2018

We silently ignore the value of smp_threads when we set
the default VSMT value, and if smp_threads is greater than VSMT
the kernel will run into trouble later.

Fixes: 8904e5a7
("spapr: Adjust default VSMT value for better migration compatibility")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 4ad64cbd)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: Adjust default VSMT value for better migration compatibility · af65bce4

Committed by David Gibson on Jan 15, 2018

fa98fbfc "PPC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core.  This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.

The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest.  This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.

Unfortunately, even that smaller-than-before leakage of host configuration
into guest-visible configuration still breaks things.  Specifically, a guest
with 4 (or fewer) vthreads/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host.  That means the
vcpu ids don't line up so you can't migrate between them, though you should
be able to.

Long term we really want to make vsmt == smp_threads for sufficiently
new machine types.  However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.

In the meantime we need some default that will work as often as
possible.  This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.

Following is case by case justification for why this is still the least
worst option.  Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.

KVM HV on POWER8 host:
   This is the overwhelmingly common case in production setups, and is
   unchanged by design.  POWER8 hosts will advertise a default VSMT mode
   of 8, and > 8 vthreads/vcore isn't permitted.

KVM HV on POWER7 host:
   Will break, but POWER7s allowing KVM were never released to the public.

KVM HV on POWER9 host:
   Not yet released to the public, breaking this now will reduce other
   breakage later.

KVM HV on PowerPC 970:
   Will theoretically break it, but it was barely supported to begin with
   and already required various user visible hacks to work.  Also so old
   that I just don't care.

TCG:
   This is the nastiest one; it means migration of TCG guests (without
   manual vsmt setting) will break.  Since TCG is rarely used in production
   I think this is worth it for the other benefits.  It does also remove
   one more barrier to TCG<->KVM migration which could be interesting for
   debugging applications.

KVM PR:
   As with TCG, this will break migration of existing configurations,
   without adding extra manual vsmt options.  As with TCG, it is rare in
   production so I think the benefits outweigh breakages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 8904e5a7)
Signed-off-by: Greg Kurz <groug@kaod.org>

spapr: Allow some cases where we can't set VSMT mode in the kernel · ad484114

Committed by David Gibson on Jan 16, 2018

At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.

But in fact we can cope with the kernel having a different vsmt as long as
  a) it's >= the actual number of vthreads/vcore (so that guest threads
     that are supposed to be on the same core act like it)
  b) it's a submultiple of the requested vsmt mode (so that guest threads
     spaced by the vsmt value will act like they're on different cores)
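
Conditions (a) and (b) amount to a two-part predicate. The sketch below is Python for illustration, not the code added to QEMU; the function and parameter names are hypothetical, and `kvm_smt` is assumed to be a positive integer.

```python
def kernel_smt_usable(kvm_smt: int, requested_vsmt: int,
                      vthreads_per_vcore: int) -> bool:
    """Return True if the guest can run although the kernel's SMT mode
    differs from the requested VSMT value."""
    covers_core = kvm_smt >= vthreads_per_vcore    # condition (a)
    divides_vsmt = requested_vsmt % kvm_smt == 0   # condition (b)
    return covers_core and divides_vsmt


print(kernel_smt_usable(4, 8, 4))  # kernel stuck at SMT4, VSMT 8, 4 threads/core
print(kernel_smt_usable(4, 8, 8))  # 8 threads/core cannot fit in SMT4
```

The first call models a host limited to SMT4 with a requested VSMT of 8: the kernel mode both covers the 4 guest threads per core and divides 8, so the mismatch is tolerable.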

Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1f20f2e0)
Signed-off-by: Greg Kurz <groug@kaod.org>

ad484114

sdl: workaround bug in sdl 2.0.8 headers · 0a30cae5

Committed by Gerd Hoffmann on Mar 07, 2018

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892087

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180307154258.9313-1-kraxel@redhat.com
(cherry picked from commit 2ca5c430)
Signed-off-by: Greg Kurz <groug@kaod.org>

memfd: fix configure test · 5892b5a9

Committed by Paolo Bonzini on Nov 28, 2017

Recent glibc added memfd_create in sys/mman.h.  This conflicts with
the definition in util/memfd.c:

    /builddir/build/BUILD/qemu-2.11.0-rc1/util/memfd.c:40:12: error: static declaration of memfd_create follows non-static declaration

Fix the configure test, and remove the sys/memfd.h inclusion since the
file actually does not exist---it is a typo in the memfd_create(2) man
page.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 75e5b70e)
Signed-off-by: Greg Kurz <groug@kaod.org>

15 Feb, 2018 · 1 commit

Update version for 2.11.1 release · 7c1beb52

Committed by Michael Roth on Feb 14, 2018

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

13 Feb, 2018 · 8 commits

spapr: add missing break in h_get_cpu_characteristics() · 00e9fba2

Committed by Greg Kurz on Feb 01, 2018

Detected by Coverity (CID 1385702). This fixes the recently added hypercall
to let guests properly apply Spectre and Meltdown workarounds.

Fixes: c59704b2 "target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICS"
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit fa86f592)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vga: check the validation of memory addr when draw text · 63112b16

Committed by linzhecheng on Jan 11, 2018

Start a VM with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then connect to the VM with a VNC client. Executing the code below in the
guest OS crashes QEMU:

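#include <stdlib.h>   /* rand, srand */
#include <sys/io.h>   /* iopl, outb (Linux/x86) */
#include <time.h>     /* time */

int main(void)
{
    int a, b;

    iopl(3);                            /* allow direct port I/O (needs root) */
    srand(time(NULL));
    while (1) {
        a = rand() % 0x100;             /* random byte value */
        b = 0x3c0 + (rand() % 0x20);    /* random VGA port in 0x3c0..0x3df */
        outb(a, b);
    }
    return 0;
}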

The above code writes to the VGA registers at random.
Writing VGA CRT controller register index 0x0C or 0x0D
(the start address register) changes the display memory
address of the upper-left pixel or character of the screen.
That address may fall outside the range of VGA RAM, so we should
validate the memory address when reading or writing it to avoid a segfault.
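
The kind of validation described above can be sketched as a range check before each access. This is an illustration only, not the patch itself: `vram_range_valid` and `read_text_cell` are hypothetical names, and the two-byte "character plus attribute" cell is a simplification of the real VGA memory layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: is [offset, offset + len) inside VRAM? */
static bool vram_range_valid(size_t vram_size, size_t offset, size_t len)
{
    /* Written so that offset + len is never computed and cannot overflow. */
    return offset <= vram_size && len <= vram_size - offset;
}

/* Read one text cell (character byte, attribute byte) or report failure. */
static bool read_text_cell(const uint8_t *vram, size_t vram_size,
                           size_t offset, uint8_t *ch, uint8_t *attr)
{
    assert(vram != NULL && ch != NULL && attr != NULL);
    if (!vram_range_valid(vram_size, offset, 2)) {
        return false;   /* guest-programmed address points outside VRAM */
    }
    *ch = vram[offset];
    *attr = vram[offset + 1];
    return true;
}
```

A caller that gets `false` back would skip drawing that cell instead of dereferencing an address derived from guest-controlled registers.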

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

input: fix memory leak · 30c3b482

Committed by linzhecheng on Dec 25, 2017

If kbd_queue is not empty and queue_count >= queue_limit,
we should free evt.
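
The ownership rule behind this fix can be sketched with a small bounded queue. The types and names here (`Event`, `EventQueue`, `queue_event`, `QUEUE_LIMIT`) are hypothetical stand-ins, not QEMU's input code; the point is only that the path which drops an event must also free it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Event { int code; } Event;   /* hypothetical event type */

#define QUEUE_LIMIT 4                       /* arbitrary limit for the sketch */

typedef struct EventQueue {
    Event *items[QUEUE_LIMIT];
    size_t count;
} EventQueue;

/*
 * Takes ownership of evt. Returns true if it was queued. When the queue
 * is full the event is freed here, so the dropped event does not leak.
 */
static bool queue_event(EventQueue *q, Event *evt)
{
    assert(q != NULL && evt != NULL);
    if (q->count >= QUEUE_LIMIT) {
        free(evt);
        return false;
    }
    q->items[q->count++] = evt;
    return true;
}
```

Without the `free(evt)` in the full-queue branch, every event dropped at the limit would be lost to the allocator.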

Change-Id: Ieeacf90d5e7e370a40452ec79031912d8b864d83
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20171225023730.5512-1-linzhecheng@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fca4774a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ui: correctly advance output buffer when writing SASL data · 88ab8538

Committed by Daniel P. Berrangé on Feb 01, 2018

In this previous commit:

  commit 8f61f1c5
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Dec 18 19:12:20 2017 +0000

    ui: track how much decoded data we consumed when doing SASL encoding

I attempted to fix a flaw with tracking how much data had actually been
processed when encoding with SASL. With that flaw, the VNC server could
mistakenly discard queued data that had not been sent.

The fix was not quite right though, because it merely decremented the
vs->output.offset value. This is effectively discarding data from the
end of the pending output buffer. We actually need to discard data from
the start of the pending output buffer. We also want to free memory that
is no longer required. The correct way to handle this is to use the
buffer_advance() helper method instead of directly manipulating the
offset value.
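
The difference between the two approaches can be sketched with a toy buffer. `OutBuf`, `discard_from_end` and `advance` are hypothetical names for illustration; QEMU's real `buffer_advance()` helper may also release memory, which this fixed-size sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity output buffer; offset = bytes still pending. */
typedef struct OutBuf {
    unsigned char data[64];
    size_t offset;
} OutBuf;

/* Wrong: drops the last len bytes, which have not been sent yet. */
static void discard_from_end(OutBuf *b, size_t len)
{
    assert(len <= b->offset);
    b->offset -= len;
}

/* Right: drops the first len bytes (already sent) and keeps the rest. */
static void advance(OutBuf *b, size_t len)
{
    assert(len <= b->offset);
    memmove(b->data, b->data + len, b->offset - len);
    b->offset -= len;
}
```

With "ABCDEF" pending and two bytes sent, `advance` leaves "CDEF" queued, whereas `discard_from_end` leaves "ABCD": the sent bytes would go out again and "EF" would be lost.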

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 20180201155841.27509-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 627ebec2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ui: avoid sign extension using client width/height · 64653b7f

Committed by Daniel P. Berrange on Jan 18, 2018

Pixman returns a signed int for the image width/height, but the VNC
protocol only permits an unsigned int16. Effective framebuffer size
is determined by the guest, limited by the video RAM size, so the
dimensions are unlikely to exceed the range of an unsigned int16,
but this is not currently validated.

With the current use of 'int' for client width/height, the calculation
of offsets in vnc_update_throttle_offset() suffers from integer size
promotion and sign extension, causing Coverity warnings:

*** CID 1385147:  Integer handling issues  (SIGN_EXTENSION)
/ui/vnc.c: 979 in vnc_update_throttle_offset()
973      * than that the client would already suffering awful audio
974      * glitches, so dropping samples is no worse really).
975      */
976     static void vnc_update_throttle_offset(VncState *vs)
977     {
978         size_t offset =
>>>     CID 1385147:  Integer handling issues  (SIGN_EXTENSION)
>>>     Suspicious implicit sign extension:
    "vs->client_pf.bytes_per_pixel" with type "unsigned char" (8 bits,
    unsigned) is promoted in "vs->client_width * vs->client_height *
    vs->client_pf.bytes_per_pixel" to type "int" (32 bits, signed), then
    sign-extended to type "unsigned long" (64 bits, unsigned).  If
    "vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel"
    is greater than 0x7FFFFFFF, the upper bits of the result will all be 1.
979             vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;

Change client_width / client_height to be a size_t to avoid sign
extension and integer promotion. Then validate that dimensions are in
range wrt the RFB protocol u16 limits.
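
A small sketch of the two ideas in this change, under the assumption of hypothetical helper names (`widen`, `frame_bytes`, `dims_in_rfb_range`) that do not appear in QEMU: a negative `int` widened to `size_t` becomes a huge value, while doing the arithmetic in `size_t` from the start avoids any signed intermediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Converting a negative int to size_t wraps to a very large value. */
static size_t widen(int value)
{
    return (size_t)value;
}

/* RFB carries width and height as unsigned 16-bit values. */
static bool dims_in_rfb_range(size_t width, size_t height)
{
    return width <= UINT16_MAX && height <= UINT16_MAX;
}

/* The product is computed in size_t, so no signed intermediate exists. */
static size_t frame_bytes(size_t width, size_t height,
                          unsigned char bytes_per_pixel)
{
    assert(dims_in_rfb_range(width, height));
    return width * height * bytes_per_pixel;
}
```

For instance, `widen(-1)` equals `SIZE_MAX`, which is the "upper bits all 1" effect Coverity describes, while `frame_bytes(1920, 1080, 4)` is simply 8294400.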

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180118155254.17053-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4c956bd8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ui: mix misleading comments & return types of VNC I/O helper methods · 9a26ca6b

Committed by Daniel P. Berrange on Dec 18, 2017

While the QIOChannel APIs for reading/writing data return ssize_t, with negative
value indicating an error, the VNC code passes this return value through the
vnc_client_io_error() method. This detects the error condition, disconnects the
client and returns 0 to indicate error. Thus all the VNC helper methods should
return size_t (unsigned), and misleading comments which refer to the possibility
of negative return values need fixing.
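
The pattern described here, a signed I/O result collapsed into an unsigned byte count, might look roughly like the following. `Client` and `client_io_result` are hypothetical names, and the sketch treats only negative values as errors; the real vnc_client_io_error() may handle more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

typedef struct Client { bool connected; } Client;   /* hypothetical state */

/*
 * Collapse a signed I/O result into an unsigned byte count.
 * A negative value means error: disconnect and report 0 bytes.
 */
static size_t client_io_result(Client *c, ssize_t ret)
{
    assert(c != NULL);
    if (ret < 0) {
        c->connected = false;
        return 0;
    }
    return (size_t)ret;
}
```

Because the helper never returns a negative value, callers only need to test for 0, which is why an unsigned return type describes the contract more accurately.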

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-14-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 30b80fd5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ui: add trace events related to VNC client throttling · 172f4e5a

Committed by Daniel P. Berrange on Dec 18, 2017

The VNC client throttling is quite subtle so will benefit from having trace
points available for live debugging.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-13-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6aa22a29)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ui: place a hard cap on VNC server output buffer size · 0c85a40e

Committed by Daniel P. Berrange on Dec 18, 2017

The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.

There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.

This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one second's worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.

To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forcibly disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the server, or the network is completely saturated preventing any sending
of data on the socket.
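
The hard cap can be sketched as a check at the top of the write path. This is an assumption-laden illustration: `Conn`, `conn_write` and `HARD_CAP_FACTOR` are hypothetical names, and whether the real check happens before or after the new data is counted is not stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HARD_CAP_FACTOR 5   /* the multiplier named in the commit message */

typedef struct Conn {            /* hypothetical connection state */
    size_t pending;              /* bytes queued in the output buffer */
    size_t throttle_threshold;   /* soft limit used for throttling */
    bool connected;
} Conn;

/* Queue len bytes, or disconnect if the hard cap is already exceeded. */
static bool conn_write(Conn *c, size_t len)
{
    assert(c != NULL);
    if (!c->connected) {
        return false;
    }
    if (c->pending > c->throttle_threshold * HARD_CAP_FACTOR) {
        c->connected = false;    /* refuse to queue any further data */
        return false;
    }
    c->pending += len;
    return true;
}
```

With a threshold of 100 bytes the cap is 500: a write that pushes `pending` to 501 still succeeds, and the next write disconnects the client.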

This completes the fix for CVE-2017-15124 started in the previous patches.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f887cf16)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>