Commits · 5e4eb851ddc5c7c23c0c5705fbf917b7b3a20586 · openeuler / qemu

13 Dec, 2016 3 commits

block: Return -ENOTSUP rather than assert on unaligned discards · 5e4eb851

Authored by Eric Blake on Nov 17, 2016

Right now, the block layer rounds discard requests, so that
individual drivers are able to assert that discard requests
will never be unaligned.  But there are some iSCSI devices
that track and coalesce multiple unaligned requests, turning them
into an actual discard if the requests eventually cover an
entire page, which implies that it is better to always pass
discard requests as low down the stack as possible.

In isolation, this patch has no semantic effect, since the
block layer currently never passes an unaligned request through.
Moreover, the block layer already has code that silently ignores
drivers that return -ENOTSUP for a discard request that cannot
be honored (as well as drivers that return 0 even when nothing
was done).  But the next patch will update the block layer to
fragment discard requests, so that clients are guaranteed that
they are either dealing with an unaligned head or tail, or an
aligned core, making it similar to the block layer semantics of
write zero fragmentation.
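The head/core/tail split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the patch series: `fragment_request` and its `align` parameter are hypothetical names, and the real block layer also has to honour other limits.

```python
def fragment_request(offset, length, align):
    """Split the byte range [offset, offset + length) at alignment boundaries.

    Returns a list of (kind, offset, length) tuples, where kind is "head",
    "core" or "tail". Empty pieces are omitted. `align` is a hypothetical
    discard granularity in bytes (for example a cluster size).
    """
    end = offset + length
    pos = offset
    pieces = []

    # Unaligned head: runs up to the next boundary, or to the end of the
    # request if the request finishes before reaching one.
    head_end = min(end, -(-offset // align) * align)
    if head_end > pos:
        pieces.append(("head", pos, head_end - pos))
        pos = head_end

    # Aligned core: whole multiples of `align`.
    core_end = (end // align) * align
    if core_end > pos:
        pieces.append(("core", pos, core_end - pos))
        pos = core_end

    # Unaligned tail: whatever is left after the last boundary.
    if end > pos:
        pieces.append(("tail", pos, end - pos))
    return pieces


print(fragment_request(100, 1000, 512))
```

A driver that can only discard whole blocks could then act on the "core" piece and return -ENOTSUP (or 0) for the head and tail.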

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49228d1e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Let write zeroes fallback work even with small max_transfer · dd11d33d

Authored by Eric Blake on Nov 17, 2016

Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary.  But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.

However, it is easy to trigger an assertion failure: the Linux
kernel limits loopback devices to advertise a max_transfer of
only 64k.  Any operation that requires falling back to writes
rather than more efficient zeroing must obey max_transfer during
that fallback, which means an unaligned head may require multiple
iterations of the write fallbacks before reaching the aligned
boundaries, when layering a format with clusters larger than 64k
atop the protocol of file access to a loopback device.
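A rough sketch of the iteration this implies, assuming a 1M cluster size and a 64k max_transfer as in the scenario described here. The helper name `fallback_head_writes` is hypothetical, and the sketch simply caps each write at max_transfer bytes; it is not the actual block-layer loop.

```python
def fallback_head_writes(offset, length, alignment, max_transfer):
    """List the (offset, length) writes needed to cover the unaligned head.

    The head runs from `offset` up to the next multiple of `alignment`
    (or to the end of the request, whichever comes first). Each write is
    capped at `max_transfer` bytes, so a long head needs several writes.
    """
    end = offset + length
    head_end = min(end, -(-offset // alignment) * alignment)
    writes = []
    while offset < head_end:
        n = min(head_end - offset, max_transfer)
        writes.append((offset, n))
        offset += n
    return writes


CLUSTER = 1024 * 1024      # cluster_size=1M
MAX_TRANSFER = 64 * 1024   # limit advertised by the loop device

for off, n in fallback_head_writes(8003584, 2093056, CLUSTER, MAX_TRANSFER):
    print(off, n)
```

With these numbers the head is 385024 bytes long (8003584 up to the 8M boundary), so it takes six writes rather than one to reach the aligned boundary.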

Test case:

$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056

In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea.  But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
CC: qemu-stable@nongnu.org
CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b2f95fee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Inform block layer about discard boundaries · c4bf37e0

Authored by Eric Blake on Nov 17, 2016

At the qcow2 layer, discard is only possible on a per-cluster
basis; at the moment, qcow2 silently rounds any unaligned
requests to this granularity.  However, an upcoming patch will
fix a regression in the block layer ignoring too much of an
unaligned discard request, by changing the block layer to
break up a discard request at alignment boundaries; for that
to work, the block layer must know about our limits.

However, we can't go one step further by changing
qcow2_discard_clusters() to assert that requests are always
aligned, since that helper function is reached on paths
outside of the block layer.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ecdbead6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

09 Dec, 2016 5 commits

slirp: Fix access to freed memory · e9dbd28e

Authored by Samuel Thibault on Nov 13, 2016

if_start() goes through the slirp->if_fastq and slirp->if_batchq
list of pending messages, and accesses ifm->ifq_so->so_nqueued of its
elements if ifm->ifq_so != NULL.  When freeing a socket, we thus need
to make sure that any pending message for this socket does not refer
to the socket any more.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Brian Candler <b.candler@pobox.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ea64d5f0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout · 92230a59

Authored by Greg Kurz on Nov 04, 2016

With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed to legacy virtio, which uses a single region.

In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.

This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.

This works for legacy mappings as well.
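To illustrate the per-region idea, here is a small sketch that checks each part of a vring separately against a remapped address range. It is a simplification under stated assumptions: the function names, the dictionary layout and the addresses are hypothetical, and it only reports which parts overlap the remapped range rather than re-validating the host mapping as the real function does.

```python
def ranges_overlap(start_a, size_a, start_b, size_b):
    """Return True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def overlapping_ring_parts(parts, remap_start, remap_size):
    """Return the names of the vring parts touched by a remapped range.

    `parts` maps a part name ("desc", "avail", "used") to a
    (guest_physical_address, size) tuple. With a virtio 1 layout the three
    parts may live in unrelated places; with a legacy layout they happen to
    be adjacent, so the same per-part check still applies.
    """
    return [
        name
        for name, (addr, size) in parts.items()
        if ranges_overlap(addr, size, remap_start, remap_size)
    ]


# Hypothetical virtio 1 layout with three separate regions.
vring = {
    "desc": (0x1000, 0x100),
    "avail": (0x5000, 0x40),
    "used": (0x9000, 0x80),
}
print(overlapping_ring_parts(vring, 0x5000, 0x1000))
```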

Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f1f9e6c5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Don't mark node clean after failed flush · 48b3aa20

Authored by Kevin Wolf on Nov 05, 2016

Commit 3ff2f67a changed bdrv_co_flush() so that no flush is issued if
the image hasn't been dirtied since the last flush. This is not quite
correct: The condition should be that the image hasn't been dirtied
since the last _successful_ flush. This patch changes the logic
accordingly.

Without this fix, subsequent bdrv_co_flush() calls would return success
without actually doing anything even though the image is still dirty.
The difference is visible in some blkdebug test cases where error
messages incorrectly disappeared after commit 3ff2f67a.
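The difference between "last flush" and "last successful flush" can be shown with a toy model. This is a sketch under stated assumptions, not the code in block/io.c: the `Node` class, its generation counters and the `backend_flush` callable are all illustrative names.

```python
class Node:
    """Toy model of a block node that skips flushes when nothing is dirty."""

    def __init__(self, backend_flush):
        self.write_gen = 0      # bumped on every write
        self.flushed_gen = 0    # generation covered by the last *successful* flush
        self.backend_flush = backend_flush  # callable returning 0 or a negative errno

    def write(self):
        self.write_gen += 1

    def flush(self):
        current_gen = self.write_gen
        if self.flushed_gen == current_gen:
            return 0  # nothing dirtied since the last successful flush
        ret = self.backend_flush()
        if ret == 0:
            # Only a successful flush may mark the node clean.
            self.flushed_gen = current_gen
        return ret


results = iter([-5, 0])  # first backend flush fails, second succeeds
node = Node(lambda: next(results))
node.write()
print(node.flush(), node.flush(), node.flush())
```

If `flushed_gen` were updated before checking `ret`, the second call would return 0 without ever reaching the backend, which is the behaviour the commit message describes as incorrect.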

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1478300595-10090-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e6af1e08)

Conflicts:
	block/io.c

* remove context dep on 99723548
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-net: mark VIRTIO_NET_F_GSO as legacy · f1372d6e

Authored by Michael S. Tsirkin on Nov 04, 2016

virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.

Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 2a083ffd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio: allow per-device-class legacy features · 63087cd7

Authored by Michael S. Tsirkin on Nov 04, 2016

Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.

Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 9b706dbb)

Conflicts:
	hw/virtio/virtio.c

* drop context dep on ff4c07df
* resolv func dep on ff4c07df creating vdc variable in
  virtio_device_class_init()
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

02 Dec, 2016 1 commit

target-ppc: Fix CPU migration from qemu-2.6 <-> later versions · 9df69dcd

Authored by David Gibson on Nov 21, 2016

When migration for target-ppc was converted to vmstate, several
VMSTATE_EQUAL() checks were foolishly included of things that really
should be internal state.  Specifically we verified equality of the
insns_flags and insns_flags2 fields, which are used within TCG to
determine which groups of instructions are available on this cpu
model.  Between qemu-2.6 and qemu-2.7 we made some changes to these
classes which broke migration.

This patch fixes migration both forwards and backwards.  On migration
from 2.6 to later versions we import the fields into temporary
variables, which we then ignore.  In migration backwards, we populate
the temporary fields from the runtime fields, but mask out the bits
which were added after qemu-2.6, allowing the VMSTATE_EQUAL in
qemu-2.6 to accept the stream.
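The two directions can be sketched as follows. This is an illustration only: the mask value and the function names are hypothetical and do not correspond to the real insns_flags bit definitions.

```python
# Hypothetical mask of the instruction-flag bits that already existed in qemu-2.6.
FLAGS_KNOWN_TO_2_6 = 0x0F


def flags_for_outgoing_stream(runtime_flags):
    """Backwards migration: hide bits that were added after qemu-2.6.

    The old receiver compares the streamed value with its own flags, so any
    newer bit would make its equality check reject the stream.
    """
    return runtime_flags & FLAGS_KNOWN_TO_2_6


def load_incoming_flags(stream_flags, runtime_flags):
    """Forwards migration: read the streamed value into a temporary and ignore it.

    The runtime flags are internal state derived from the CPU model, so the
    value sent by the source is not applied.
    """
    _ignored = stream_flags
    return runtime_flags


new_cpu_flags = 0x1F  # hypothetical: one bit (0x10) added after 2.6
print(hex(flags_for_outgoing_stream(new_cpu_flags)))
print(hex(load_incoming_flags(0x0F, new_cpu_flags)))
```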

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 16a2497b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

19 Nov, 2016 2 commits

net: fix sending of data with -net socket, listen backend · 95a06380

Authored by Daniel P. Berrange on Nov 04, 2016

The use of -net socket,listen was broken in the following
commit

  commit 16a3df40
  Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
  Date:   Fri May 13 15:35:19 2016 +0800

    net/net: Add SocketReadState for reuse codes

    This function is from net/socket.c, move it to net.c and net.h.
    Add SocketReadState to make others reuse net_fill_rstate().
    suggestion from jason.

This refactored the state out of NetSocketState into a
separate SocketReadState. This refactoring requires
that a callback is provided to be triggered upon
completion of a packet receive from the guest.

The patch only registered this callback in the codepaths
hit by -net socket,connect, not -net socket,listen. So
as a result packets sent by the guest in the latter case
get dropped on the floor.

This bug is hidden because net_fill_rstate() silently
does nothing if the callback is not set.
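A toy reassembly buffer shows why a missing completion callback goes unnoticed, and how a strict check makes it loud instead. This is a sketch under stated assumptions: the `ReadState` class and its `strict` flag are hypothetical, and the 4-byte big-endian length prefix is assumed framing for the example only.

```python
class ReadState:
    """Reassemble length-prefixed packets and hand them to a callback."""

    def __init__(self, finalize=None, strict=False):
        self.finalize = finalize  # called once per complete packet
        self.strict = strict      # if True, refuse to drop packets silently
        self.buf = bytearray()

    def fill(self, data):
        """Feed raw bytes; return the number of packets delivered."""
        self.buf += data
        delivered = 0
        while len(self.buf) >= 4:
            size = int.from_bytes(self.buf[:4], "big")
            if len(self.buf) < 4 + size:
                break  # packet not complete yet
            packet = bytes(self.buf[4:4 + size])
            del self.buf[:4 + size]
            if self.finalize is None:
                if self.strict:
                    raise RuntimeError("no completion callback registered")
                continue  # packet is silently dropped
            self.finalize(packet)
            delivered += 1
        return delivered


received = []
connected = ReadState(finalize=received.append)
connected.fill(b"\x00\x00\x00\x03abc")
print(received)

listening = ReadState()  # callback never registered
print(listening.fill(b"\x00\x00\x00\x03abc"))
```

The second reader consumes the bytes and reports nothing, which mirrors packets being dropped on the floor without any error.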

This patch adds in the missing callback registration
and also adds an assert so that QEMU aborts if there
are any other codepaths hit which are missing the
callback.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit e79cd406)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

acpi/ipmi: Initialize the fwinfo before fetching it · 1790a9d7

Authored by Corey Minyard on Oct 24, 2016

The initialization was missed before, resulting in some
bad data in the smbus case.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 698ae42b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

03 Nov, 2016 29 commits

memory: Don't use memcpy for ram_device regions · 1b16ded6

Authored by Alex Williamson on Oct 31, 2016

With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access width, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a2e242b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

memory: Replace skip_dump flag with "ram_device" · ca83f87a

Authored by Alex Williamson on Oct 31, 2016

Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 21e00fa5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

net: rtl8139: limit processing of ring descriptors · 2817466c

Authored by Prasad J Pandit on Oct 21, 2016

RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with maximum of 64 descriptors. While
processing transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and runs forever. Add
check to avoid it.
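The kind of bound this adds can be sketched as below. It is a minimal model, assuming a ring whose descriptors stay marked as owned by the NIC no matter how often they are processed (as a hostile guest could arrange); the function name and the dictionary layout are hypothetical.

```python
RING_SIZE = 64  # maximum number of descriptors per ring, per the description above


def transmit_pending(ring, start=0):
    """Process owned descriptors, visiting at most RING_SIZE of them.

    `ring` is a list of RING_SIZE dicts with an "own" flag. Without the
    bound, a ring in which every descriptor always reads back as owned
    would keep the loop running forever.
    """
    processed = 0
    index = start
    for _ in range(RING_SIZE):
        if not ring[index]["own"]:
            break  # nothing more queued by the guest
        processed += 1
        index = (index + 1) % RING_SIZE
    return processed


hostile_ring = [{"own": True} for _ in range(RING_SIZE)]
print(transmit_pending(hostile_ring))
```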

Reported-by: Andrew Henderson <hendersa@icculus.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c7c35916)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qemu-iotests: Test I/O in a single drive from a throttling group · e389e44a

Authored by Alberto Garcia on Oct 17, 2016

iotest 093 contains a test that creates a throttling group with
several drives and performs I/O in all of them. This patch adds a new
test that creates a similar setup but only performs I/O in one of the
drives at the same time.

This is useful to test that the round robin algorithm is behaving
properly in these scenarios, and is specifically written using the
regression introduced in 27ccdd52 as an example.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a26ddb43)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

throttle: Correct access to wrong BlockBackendPublic structures · b1fdc941

Authored by Alberto Garcia on Oct 17, 2016

In 27ccdd52 the throttling fields were moved from BlockDriverState to
BlockBackend. However, in a few cases
the code started using throttling fields from the active BlockBackend
instead of the round-robin token, making the algorithm behave
incorrectly.

This can cause starvation if there's a throttling group with several
drives but only one of them has I/O.

Cc: qemu-stable@nongnu.org
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6bf77e1c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ppc/kvm: Mark 64kB page size support as disabled if not available · 61781984

Authored by Thomas Huth on Sep 21, 2016

QEMU currently refuses to start with KVM-PR and only prints out

	qemu: fatal: Unknown MMU model 851972

when being started there. This is because commit 4322e8ce
("ppc: Fix 64K pages support in full emulation") introduced a new
POWERPC_MMU_64K bit to indicate support for this page size, but
it never gets cleared on KVM-PR if the host kernel does not support
this. Thus we've got to turn off this bit in the mmu_model for KVM-PR.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 0d594f55)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

rbd: shift byte count as a 64-bit value · 857efecf

Authored by Paolo Bonzini on Oct 10, 2016

Otherwise, reads of more than 2GB fail.  Until commit 7bbca9e2,
reads of 2^41 bytes succeeded at least theoretically.

In fact, pdiscard ought to receive a 64-bit integer as the
count for the same reason.
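The overflow can be reproduced outside C by forcing the result of the shift into a 32-bit signed integer. This is only a demonstration of the arithmetic: the helper names are hypothetical, and the 512-byte sector size (a shift of 9) is an assumption stated in the code.

```python
import ctypes

SECTOR_BITS = 9  # assumed: 512-byte sectors


def byte_count_32bit(nb_sectors):
    """Shift performed in a 32-bit signed int, as when the operand is a plain int."""
    return ctypes.c_int32(nb_sectors << SECTOR_BITS).value


def byte_count_64bit(nb_sectors):
    """Shift performed after widening the operand to 64 bits."""
    return ctypes.c_int64(nb_sectors << SECTOR_BITS).value


sectors = 1 << 22  # 2 GiB worth of 512-byte sectors
print(byte_count_32bit(sectors), byte_count_64bit(sectors))
```

ctypes integer types do no overflow checking, so the 32-bit variant wraps to a negative byte count at exactly 2 GiB, while the 64-bit variant keeps the correct value.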

Reported by Coverity.

Fixes: 7bbca9e2
Cc: qemu-stable@nongnu.org
Cc: kwolf@redhat.com
Cc: eblake@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e948f663)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tests/test-qmp-input-strict: Cover missing struct members · 99837b0d

Authored by Markus Armbruster on Oct 04, 2016

These tests would have caught the bug fixed by the previous commit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1475594630-24758-1-git-send-email-armbru@redhat.com>
(cherry picked from commit bce3035a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qapi: Fix crash when 'any' or 'null' parameter is missing · b34c7bd4

Authored by Marc-André Lureau on Sep 23, 2016

Unlike the other visit methods, visit_type_any() and visit_type_null()
neglect to check whether qmp_input_get_object() succeeded.  They crash
when it fails.  Reproducer:

{ "execute": "qom-set",
  "arguments": { "path": "/machine", "property": "rtc-time" } }

Will crash with:

qapi/qapi-visit-core.c:277: visit_type_any: Assertion `!err != !*obj'
failed

Broken in commit 5c678ee8.  Fix by adding the missing error checks.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20160922203927.28241-3-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message rephrased]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

(cherry picked from commit c4897802)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qmp: fix object-add assert() without props · f3467f56

Authored by Marc-André Lureau on Sep 23, 2016

Since commit ad739706, user_creatable_add_type() expects to be
given a qdict. However, if object-add is called without props, you reach
the assert: "qemu/qom/object_interfaces.c:115: user_creatable_add_type:
Assertion `qdict' failed.", because the qdict isn't created in this
case (it's optional).

Furthermore, qmp_input_visitor_new() is not meant to be called without a
dict, and a further commit will assert in this situation.

If none given, create an empty qdict in qmp to avoid the
user_creatable_add_type() assert(qdict).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20160922203927.28241-2-marcandre.lureau@redhat.com>
Tested-by: Xiao Long Jiang <zxiaol@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit e64c75a9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

char: fix missing return in error path for chardev TLS init · 5be53356

Authored by Daniel P. Berrange on Sep 30, 2016

If the qio_channel_tls_new_(server|client) methods fail,
we disconnect the client. Unfortunately a missing return
means we then go on to try and run the TLS handshake on
a NULL I/O channel. This gives predictably segfaulty
results.

The main way to trigger this is to request a bogus TLS
priority string for the TLS credentials. e.g.

  -object tls-creds-x509,id=tls0,priority=wibble,...

Most other ways appear impossible to trigger except
perhaps if OOM conditions cause gnutls initialization
to fail.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 660a2d83)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qht: fix unlock-after-free segfault upon resizing · af29bd31

Authored by Emilio G. Cota on Oct 05, 2016

The old map's bucket locks are being unlocked *after*
that same old map has been passed to RCU for destruction.
This is a bug that can cause a segfault, since there's
no guarantee that the deletion will be deferred (e.g.
there may be no concurrent readers).

The segfault is easily triggered in RHEL6/CentOS6 with qht-test,
particularly on a single-core system or by pinning qht-test
to a single core.

Fix it by unlocking the map's bucket locks right after having
published the new map, and (crucially) before marking the map
for deletion via call_rcu().

While at it, expand qht_do_resize() to atomically do (1) a reset,
(2) a resize, or (3) a reset+resize. This simplifies the calling
code, since the new function (qht_do_resize_reset()) acquires
and releases the buckets' locks.

Note that no qht_do_reset inline is provided, since it would have
no users--qht_reset() already performs a reset without taking
ht->lock.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-3-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 76b553b3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qht: simplify qht_reset_size · f72ca1ac

Authored by Emilio G. Cota on Oct 05, 2016

Sometimes gcc doesn't pick up the fact that 'new' is properly
set if 'resize == true', which may generate an unnecessary
build warning.

Fix it by removing 'resize' and directly checking that 'new'
is non-NULL.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-2-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f555a9d0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

migrate: Fix cpu-throttle-increment regression in HMP · 4d45fe11

Authored by Eric Blake on Sep 08, 2016

Commit 69ef1f36 accidentally broke migrate_set_parameter's ability
to set the cpu-throttle-increment to anything other than the
default, because it forgot to parse the user's string into an
integer.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit bb2b777c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block-backend: remove blk_flush_all · 4a25ab2a

Authored by John Snow on Sep 22, 2016

We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.

The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49137bf6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qemu: use bdrv_flush_all for vm_stop et al · 95200ebb

Authored by John Snow on Sep 22, 2016

Reimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all,
bdrv_flush_all does not have device model restrictions. This allows
us to flush and halt unconditionally without error.

This allows us to do things like migrate when we have a device with
an open tray, but has a node that may need to be flushed, or nodes
that aren't currently attached to any device and need to be flushed.

Specifically, this allows us to migrate when we have a CDROM with
an open tray.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 22af08ea)

Conflicts:
	cpus.c

* drop context dependency on 6d0ceb80
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: reintroduce bdrv_flush_all · 8e945125

Authored by John Snow on Sep 22, 2016

Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.

blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.

Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4085f5c7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iscsi: Fix divide-by-zero regression on raw SG devices · d40d148f

Authored by Eric Blake on Sep 07, 2016

When qemu uses iscsi devices in sg mode, iscsilun->block_size
is left at 0.  Prior to commits cf081fca and similar, when
block limits were tracked in sectors, this did not matter:
various block limits were just left at 0.  But when we started
scaling by block size, this caused SIGFPE.

Then, in a later patch, commit a5b8dd2c added an assertion to
bdrv_open_common() that request_alignment is always non-zero;
which was not true for SG mode.  Rather than relax that assertion,
we can just provide a sane value (we don't know of any SG device
with a block size smaller than qemu's default sizing of 512 bytes).
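The effect of supplying a fallback value can be sketched like this. The helper names are hypothetical and the sketch only covers the case where the block size is used as a divisor; it is not the body of iscsi_refresh_limits().

```python
DEFAULT_BLOCK_SIZE = 512  # the default sizing mentioned above


def effective_block_size(block_size):
    """Use the device's block size if known, otherwise the 512-byte default.

    In SG mode the block size is left at 0, which would otherwise end up as
    the denominator in the scaling below.
    """
    return block_size if block_size else DEFAULT_BLOCK_SIZE


def bytes_to_blocks(nbytes, block_size):
    """Convert a byte count to blocks without ever dividing by zero."""
    return nbytes // effective_block_size(block_size)


print(bytes_to_blocks(4096, 0), bytes_to_blocks(4096, 4096))
```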

One possible solution for SG mode is to just blindly skip ALL
of iscsi_refresh_limits(), since we already short circuit so
many other things in sg mode.  But this patch takes a slightly
more conservative approach, and merely guarantees that scaling
will succeed, while still using multiples of the original size
where possible.  Resulting limits may still be zero in SG mode
(that is, we mostly only fix block_size used as a denominator
or which affect assertions, not all uses).

Reported-by: Holger Schranz <holger@fam-schranz.de>
Signed-off-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org

Message-Id: <1473283640-15756-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 95eaa785)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: fix encryption during cow of sectors · f9856029

Authored by Daniel P. Berrange on Sep 06, 2016

Broken in previous commit:

  commit aaa4d20b
  Author: Kevin Wolf <kwolf@redhat.com>
  Date:   Wed Jun 1 15:21:05 2016 +0200

      qcow2: Make copy_sectors() byte based

The copy_sectors() code was originally using the 'sector'
parameter for encryption, which was passed in by the caller
from the QCowL2Meta.offset field (aka the guest logical
offset).

After the change, the code is using 'cluster_offset' which
was passed in from QCow2L2Meta.alloc_offset field (aka the
host physical offset).

This would cause the data to be encrypted using an incorrect
initialization vector which will in turn cause later reads
to return garbage.

Although current qcow2 built-in encryption is blocked from
usage in the emulator, one could still hit this if writing
to the file via qemu-{img,io,nbd} commands.
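Why the wrong offset leads to garbage can be shown with a toy cipher whose keystream depends on the sector number, standing in for a per-sector initialization vector. This is emphatically not the qcow2 cipher and not secure; the function name, key and sector numbers are all made up for the illustration.

```python
import hashlib


def toy_sector_cipher(key, sector_num, data):
    """XOR `data` with a keystream derived from the key and the sector number.

    Applying the function twice with the same sector number restores the
    original data, so it serves as both "encrypt" and "decrypt".
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = key + sector_num.to_bytes(8, "little") + counter.to_bytes(4, "little")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"example key"
plaintext = b"A" * 512
guest_sector = 10    # derived from the guest logical offset
host_sector = 2048   # derived from the host physical offset

# Correct: encrypt and decrypt both use the guest sector number.
good = toy_sector_cipher(key, guest_sector, plaintext)
print(toy_sector_cipher(key, guest_sector, good) == plaintext)

# Buggy: the copy path encrypts with the host sector number, while later
# reads still decrypt with the guest sector number.
bad = toy_sector_cipher(key, host_sector, plaintext)
print(toy_sector_cipher(key, guest_sector, bad) == plaintext)
```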

Cc: qemu-stable@nongnu.org
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bb9f8dd0)

Conflicts:
	tests/qemu-iotests/group

* drop context dependency on non-2.7 iotest groups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vfio/pci: Fix regression in MSI routing configuration · a3a25455

Authored by David Gibson on Sep 15, 2016

d1f6af6a "kvm-irqchip: simplify kvm_irqchip_add_msi_route" was a cleanup
of kvmchip routing configuration, that was mostly intended for x86.
However, it also contains a subtle change in behaviour which breaks EEH[1]
error recovery on certain VFIO passthrough devices on spapr guests.  So far
it's only been seen on a BCM5719 NIC on a POWER8 server, but there may be
other hardware with the same problem.  It's also possible there could be
circumstances where it causes a bug on x86 as well, though I don't know of
any obvious candidates.

Prior to d1f6af6a, both vfio_msix_vector_do_use() and
vfio_add_kvm_msi_virq() used msg == NULL as a special flag to mark this
as the "dummy" vector used to make the host hardware state sync with the
guest expected hardware state in terms of MSI configuration.

Specifically that flag caused vfio_add_kvm_msi_virq() to become a no-op,
meaning the dummy irq would always be delivered via qemu. d1f6af6a changed
vfio_add_kvm_msi_virq() so it takes a vector number instead of the msg
parameter, and determines the correct message itself.  The test for !msg
was removed, and not replaced with anything there or in the caller.

With an spapr guest which has a VFIO device, if an EEH error occurs on the
host hardware, then the device will be isolated then reset.  This is a
combination of host and guest action, mediated by some EEH related
hypercalls.  I haven't fully traced the mechanics, but somehow installing
the kvm irqchip route for the dummy irq on the BCM5719 means that after EEH
reset and recovery, at least some irqs are no longer delivered to the
guest.

In particular, the guest never gets the link up event, and so the NIC is
effectively dead.

[1] EEH (Enhanced Error Handling) is an IBM POWER server specific PCI-*
    error reporting and recovery mechanism.  The concept is somewhat
    similar to PCI-E AER, but the details are different.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1373802

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Gavin Shan <gwshan@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Fixes: d1f6af6a ("kvm-irqchip: simplify kvm_irqchip_add_msi_route")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 6d17a018)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

a3a25455

s390x/css: handle cssid 255 correctly · 533dedf0

Authored by Cornelia Huck on Aug 15, 2016

The cssid 255 is reserved but still valid from an architectural
point of view. However, feeding a bogus schid of 0xffffffff into
the virtio hypercall will lead to a crash:

Stack trace of thread 138363:
        #0  0x00000000100d168c css_find_subch (qemu-system-s390x)
        #1  0x00000000100d3290 virtio_ccw_hcall_notify
        #2  0x00000000100cbf60 s390_virtio_hypercall
        #3  0x000000001010ff7a handle_hypercall
        #4  0x0000000010079ed4 kvm_cpu_exec (qemu-system-s390x)
        #5  0x00000000100609b4 qemu_kvm_cpu_thread_fn
        #6  0x000003ff8b887bb4 start_thread (libpthread.so.0)
        #7  0x000003ff8b78df0a thread_start (libc.so.6)

This is because the css array was only allocated for 0..254
instead of 0..255.

Let's fix this by bumping MAX_CSSID to 255 and fencing off the
reserved cssid of 255 during css image allocation.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 882b3b97)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
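
The off-by-one and the fence on the reserved id in the s390x/css message above can be sketched as follows. This is an illustration under assumptions, not the QEMU source: `css_table`, `css_lookup()` and `css_create()` are hypothetical names; only `MAX_CSSID` and the value 255 come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CSSID 255           /* highest architecturally valid cssid */

struct css_image { int unused; };

/* One slot per cssid in 0..MAX_CSSID, hence MAX_CSSID + 1 entries. */
static struct css_image *css_table[MAX_CSSID + 1];
static struct css_image css_storage[MAX_CSSID + 1];

/* Lookup is in bounds for every value an 8-bit cssid can take. */
static struct css_image *css_lookup(unsigned char cssid)
{
    return css_table[cssid];
}

/* Allocation refuses the reserved cssid 255 and ids already in use. */
static bool css_create(unsigned char cssid)
{
    if (cssid == MAX_CSSID || css_table[cssid] != NULL) {
        return false;
    }
    css_table[cssid] = &css_storage[cssid];
    return true;
}
```

A lookup with cssid 255 now reads a valid (empty) slot instead of running past the end of the array, while no image can ever be created for it.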

533dedf0

ahci: clear aiocb in ncq_cb · 54c26b73

Authored by John Snow on Sep 26, 2016

Similar to existing fixes for IDE (87ac25fd) and ATAPI (7f951b2d), the
AIOCB must be cleared in the callback. Otherwise, we may accidentally
try to reset a dangling pointer in bdrv_aio_cancel() from a port reset.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1474575040-32079-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit df403bc5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
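
The pattern behind the ahci fix above is "forget the request handle as soon as the completion callback runs". A minimal sketch, assuming hypothetical types `struct aiocb` and `struct ncq_slot` and helper names that are not taken from QEMU:

```c
#include <assert.h>
#include <stddef.h>

struct aiocb { int cancelled; };    /* hypothetical request handle */

struct ncq_slot {
    struct aiocb *aiocb;            /* non-NULL only while I/O is in flight */
};

/* Completion callback: the handle is invalid after this point, so the
 * slot must stop pointing at it. */
static void ncq_complete(struct ncq_slot *slot)
{
    slot->aiocb = NULL;
}

/* Port reset: cancel only requests that are still in flight.
 * Returns 1 if a cancel was issued, 0 otherwise. */
static int port_reset(struct ncq_slot *slot)
{
    if (slot->aiocb == NULL) {
        return 0;
    }
    slot->aiocb->cancelled = 1;
    slot->aiocb = NULL;
    return 1;
}
```

If `ncq_complete()` did not clear the pointer, a later `port_reset()` would try to cancel a request that has already finished.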

54c26b73

virtio-scsi: Don't abort when media is ejected · f5436d1d

Authored by Fam Zheng on Sep 14, 2016

With an ejected block backend, blk_get_aio_context() would return
qemu_aio_context. In this case don't assert.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1473848224-24809-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2a2d69f4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

f5436d1d

scsi-disk: Cleaning up around tray open state · 3550eeaf

Authored by Fam Zheng on Sep 14, 2016

Even if the tray is not open, it can be empty (blk_is_inserted() == false).
Handle both cases correctly by replacing the s->tray_open checks with
blk_is_available(), which is an AND of the two.

Also simplify successive checks of the two into blk_is_available(), in a
couple of cases.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1473848224-24809-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cd723b85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
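
The "AND of the two" in the scsi-disk message above is "medium inserted" AND "tray not open". A minimal sketch with a hypothetical `struct drive` standing in for the block backend state (the field and function names are illustrative, not QEMU's):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two independent conditions. */
struct drive {
    bool tray_open;     /* tray is physically open */
    bool inserted;      /* a medium is present */
};

/* The medium is usable only if it is present and the tray is closed. */
static bool drive_is_available(const struct drive *d)
{
    return d->inserted && !d->tray_open;
}
```

A check on `tray_open` alone misses the case of a closed but empty tray, which is the case the commit addresses.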

3550eeaf

iothread: Stop threads before main() quits · 316c2c94

Authored by Fam Zheng on Sep 08, 2016

Right after main_loop ends, we release various things but keep the iothread
alive. The latter is not prepared for the sudden change of resources.

Specifically, after bdrv_close_all(), the virtio-scsi dataplane gets a
surprise at the empty BlockBackend:

(gdb) bt
    at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543
    at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577

This is because d->conf.blk->root is set to NULL, so
blk_get_aio_context() returns qemu_aio_context, whereas s->ctx is still
pointing to the iothread:

    hw/scsi/virtio-scsi.c:543:

    if (s->dataplane_started) {
        assert(blk_get_aio_context(d->conf.blk) == s->ctx);
    }

To fix this, let's stop iothreads before doing bdrv_close_all().

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dce8921b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

316c2c94

crypto: ensure XTS is only used with ciphers with 16 byte blocks · 98b4465f

Authored by Daniel P. Berrange on Aug 24, 2016

The XTS cipher mode needs to be used with a cipher which has
a block size of 16 bytes. If a mis-matching block size is used,
the code will either corrupt memory beyond the IV array, or
not fully encrypt/decrypt the IV.

This fixes a memory corruption crash when attempting to use
cast5-128 with xts, since the former has an 8 byte block size.

A test case is added to ensure the cipher creation fails with
such an invalid combination.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit a5d2f44d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
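
The rule in the crypto message above amounts to a validation step at cipher creation time. The sketch below assumes a hypothetical `enum cipher_mode` and `cipher_mode_allowed()` helper; the only facts taken from the commit message are the 16-byte requirement for XTS and the 8-byte block size of cast5-128.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XTS_BLOCK_SIZE 16   /* XTS works on 16-byte cipher blocks */

/* Hypothetical mode identifiers, for illustration only. */
enum cipher_mode { MODE_ECB, MODE_CBC, MODE_XTS };

/* Reject XTS unless the underlying cipher has a 16-byte block.
 * Other modes are not restricted by this check. */
static bool cipher_mode_allowed(enum cipher_mode mode, size_t block_len)
{
    if (mode == MODE_XTS && block_len != XTS_BLOCK_SIZE) {
        return false;
    }
    return true;
}
```

Under this check, a cast5-128 style cipher (8-byte blocks) is refused for XTS but still accepted for the other modes.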

98b4465f

scsi: mptconfig: fix misuse of MPTSAS_CONFIG_PACK · 8342e124

Authored by Paolo Bonzini on Aug 29, 2016

These issues cause, respectively, a QEMU crash and a leak of 2 bytes of
stack.  They were discovered by VictorV of 360 Marvel Team.

Reported-by: Tom Victor <i-tangtianwen@360.cm>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 65a8e1f6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

8342e124

scsi: mptconfig: fix an assert expression · 0b6ab253

Authored by Prasad J Pandit on Aug 31, 2016

When the LSI SAS1068 Host Bus emulator builds configuration page
headers, mptsas_config_pack() should assert that the size
fits in a byte.  However, the size is expressed in 32-bit
units, so up to 1020 bytes fit.  The assertion was only
allowing replies up to 252 bytes, so fix it.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472645167-30765-2-git-send-email-ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cf2bce20)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
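
The 252 and 1020 figures in the mptconfig message above follow from the units: the largest multiple of 4 below 256 is 252, while 255 units of 4 bytes each give 1020. The sketch below restates the two conditions as predicates; `page_len_ok_old()` and `page_len_ok_new()` are hypothetical names, and the exact expression used in QEMU may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Too strict: compares the byte count itself against the one-byte limit. */
static bool page_len_ok_old(size_t bytes)
{
    return bytes < 256 && bytes % 4 == 0;
}

/* Intended check: the length in 32-bit units must fit in one byte. */
static bool page_len_ok_new(size_t bytes)
{
    return bytes / 4 < 256 && bytes % 4 == 0;
}
```

A 1020-byte page passes the second predicate but not the first, and 1024 bytes fails both.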

0b6ab253

vmw_pvscsi: check page count while initialising descriptor rings · 74288657

Authored by Prasad J Pandit on Aug 31, 2016

VMware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the page count for these rings to
an arbitrary value, leading to an infinite loop or out-of-bounds
access. Add a check to avoid it.

Reported-by: Tom Victor <vv474172261@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472626169-12989-1-git-send-email-ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7f61f469)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
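
A guest-supplied page count of the kind described in the vmw_pvscsi message above is typically validated before it is used as a loop bound or array index. The sketch below is an illustration only: `RING_MAX_NUM_PAGES` and `ring_pages_valid()` are hypothetical names, and the limit of 32 is an assumed value, not one stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_MAX_NUM_PAGES 32u  /* assumed size of the per-ring page array */

/* Accept only counts that index within the fixed-size page array.
 * Zero is rejected too, since an empty ring cannot hold descriptors. */
static bool ring_pages_valid(uint32_t num_pages)
{
    return num_pages >= 1 && num_pages <= RING_MAX_NUM_PAGES;
}
```

Ring setup would bail out early when this returns false, before any page address supplied by the guest is read.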

74288657