Commits · 066ae4f829bcc6b8c98994a7c22fe570d500d548 · openeuler / qemu

29 May, 2017 8 commits

ehci: fix frame timer invocation. · 3bfecee2

Committed by Gerd Hoffmann on May 19, 2017

ehci registers ehci_frame_timer as both timer and bottom half, which
turned out to be a bad idea: it can be called as a bottom half while
it is already running as a timer, and it isn't prepared to handle
recursive calls.

Change the timer func to just schedule the bottom half to avoid this.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449609
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170519120428.25981-1-kraxel@redhat.com
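The deferral pattern described in this commit message can be sketched as follows. This is a toy model, not the QEMU code: ToyBH, ToyEHCI and all toy_* functions are hypothetical stand-ins for QEMU's bottom-half and timer machinery. The timer callback only marks the bottom half as pending, so the frame handler is entered from a single place and cannot nest.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a bottom half: a callback plus a pending flag. */
typedef struct ToyBH {
    void (*cb)(void *opaque);
    void *opaque;
    bool scheduled;
} ToyBH;

typedef struct ToyEHCI {
    ToyBH bh;
    int depth;     /* current nesting level of the frame handler */
    int max_depth; /* deepest nesting observed; 1 means no recursion */
    int runs;      /* number of times the frame handler ran */
} ToyEHCI;

/* The real work: only ever reached through the bottom half. */
void toy_frame_work(void *opaque)
{
    ToyEHCI *s = opaque;

    s->depth++;
    if (s->depth > s->max_depth) {
        s->max_depth = s->depth;
    }
    s->runs++;
    s->depth--;
}

/* Timer callback: does no frame processing itself, it only defers. */
void toy_timer_cb(void *opaque)
{
    ToyEHCI *s = opaque;

    s->bh.scheduled = true;
}

/* Main-loop step: run the bottom half once if it is pending. */
void toy_bh_poll(ToyBH *bh)
{
    if (bh->scheduled) {
        bh->scheduled = false;
        bh->cb(bh->opaque);
    }
}

void toy_ehci_init(ToyEHCI *s)
{
    s->bh.cb = toy_frame_work;
    s->bh.opaque = s;
    s->bh.scheduled = false;
    s->depth = s->max_depth = s->runs = 0;
}
```

Two timer expirations before the main loop runs collapse into one pending bottom half, so the handler runs once and never recursively.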

usb: don't wakeup during coldplug · 26022652

Committed by Gerd Hoffmann on May 23, 2017

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1452512
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170523084635.20062-1-kraxel@redhat.com

usb-hub: set PORT_STAT_C_SUSPEND on host-initiated wake-up · 6361bbc7

Committed by Ladi Prosek on May 22, 2017

PORT_STAT_C_SUSPEND should be set even on host-initiated wake-up,
i.e. on ClearPortFeature(PORT_SUSPEND). Windows is known to not
work properly otherwise.

Side note, since PORT_ENABLE looks similar and might appear to
have the same issue: According to 11.24.2.7.2.2 C_PORT_ENABLE:

  "This bit is set when the PORT_ENABLE bit changes from one to
  zero as a result of a Port Error condition (see Section 11.8.1).
  This bit is not set on any other changes to PORT_ENABLE."
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 20170522123325.2199-1-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
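A minimal sketch of the behaviour described above, assuming the USB 2.0 hub layout in which the suspend bit sits at bit 2 of both wPortStatus and wPortChange. ToyHubPort and the TOY_* names are hypothetical; the actual patch may be structured differently.

```c
#include <stdint.h>

/* Bit 2 in wPortStatus (suspended) and in wPortChange (suspend change). */
#define TOY_PORT_STAT_SUSPEND   0x0004u
#define TOY_PORT_STAT_C_SUSPEND 0x0004u

typedef struct ToyHubPort {
    uint16_t wPortStatus;
    uint16_t wPortChange;
} ToyHubPort;

/* Handle ClearPortFeature(PORT_SUSPEND), i.e. a host-initiated wake-up. */
void toy_clear_port_suspend(ToyHubPort *port)
{
    if (port->wPortStatus & TOY_PORT_STAT_SUSPEND) {
        port->wPortStatus &= (uint16_t)~TOY_PORT_STAT_SUSPEND;
        /* Report the resume to the host through the change bit as well. */
        port->wPortChange |= TOY_PORT_STAT_C_SUSPEND;
    }
}
```

In this sketch a port that was not suspended is left untouched, which is an assumption rather than something the commit message states.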

xhci: add CONFIG_USB_XHCI_NEC option · 2da077a8

Committed by Gerd Hoffmann on May 17, 2017

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451189
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170517103313.8459-2-kraxel@redhat.com

xhci: split into multiple files · 0bbb2f3d

Committed by Gerd Hoffmann on May 17, 2017

Move structs and defines to hcd-xhci.h.
Move NEC controller variant to hcd-xhci-nec.c.
No functional changes.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170517103313.8459-1-kraxel@redhat.com

usb: Simplify the parameter parsing of the legacy usb serial device · e14935df

Committed by Thomas Huth on May 19, 2017

Coverity complains about the current code, so let's get rid of
the now unneeded while loop and simply always emit "unrecognized
serial USB option" for all unsupported options.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1495177204-16808-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

ehci: fix overflow in frame timer code · 3ae7eb88

Committed by Gerd Hoffmann on May 15, 2017

In case the frame timer doesn't run for a while due to the host being
busy, skipped_uframes can become big enough that UFRAME_TIMER_NS *
skipped_uframes overflows, which in turn throws off all subsequent
ehci frame timer calculations.
Reported-by: 李林 <8610_28@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170515104543.32044-1-kraxel@redhat.com
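The arithmetic behind this overflow can be illustrated with a small sketch. It assumes a microframe length of 125 µs (125000 ns) and uses unsigned 32-bit arithmetic so that the wraparound is well defined in C; the variable types in the real code are not given in the commit message, and the TOY_* and toy_* names are hypothetical.

```c
#include <stdint.h>

#define TOY_UFRAME_NS 125000u /* assumed: one microframe is 125 us */

/* 32-bit product: wraps once the stall exceeds roughly 4.29 seconds. */
uint32_t toy_elapsed_ns_narrow(uint32_t skipped_uframes)
{
    return (uint32_t)(TOY_UFRAME_NS * skipped_uframes);
}

/* 64-bit product: widen before multiplying so the result cannot wrap here. */
uint64_t toy_elapsed_ns_wide(uint64_t skipped_uframes)
{
    return (uint64_t)TOY_UFRAME_NS * skipped_uframes;
}
```

With 40000 skipped microframes (a 5 second stall) the true value is 5,000,000,000 ns, which does not fit in 32 bits, so the narrow version yields 705,032,704 instead.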

pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry · ede24a02

Committed by Ladi Prosek on May 25, 2017

For reasons unknown, Windows won't online all memory, both the memory
given on the command line and the memory hot-plugged later, unless the
hotplug mem hole SRAT entry specifies a node greater than or equal to
the ones where memory is added.

Using the highest node on the machine makes recent versions of Windows
happy.

With this example command line:
  ... \
  -m 1024,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
  -object memory-backend-ram,size=1G,id=mem-mem1 \
  -device pc-dimm,id=dimm-mem1,memdev=mem-mem1,node=1

Windows reports a total of 1G of RAM without this commit and the expected
2G with this commit.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>

26 May, 2017 11 commits

nvme: Add support for Controller Memory Buffers · a896f7f2

Committed by Stephen Bates on May 16, 2017

Implement NVMe Controller Memory Buffers (CMBs) which were added in
version 1.2 of the NVMe Specification. This patch adds an optional
argument (cmb_size_mb) which indicates the size of the CMB (in
MB). Currently only the Submission Queue Support (SQS) is enabled
which aligns with the current Linux driver for NVMe.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
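As an illustration of how a size in MB with only SQS enabled could be advertised, the sketch below packs a CMBSZ register value. The field layout (SQS at bit 0, size units at bits 11:8 with 2 meaning 1 MiB, size at bits 31:12) is my reading of NVMe 1.2 and should be checked against the specification; toy_cmbsz and the TOY_* macros are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Assumed CMBSZ layout (NVMe 1.2); verify against the specification. */
#define TOY_CMBSZ_SQS       (1u << 0) /* submission queues may live in the CMB */
#define TOY_CMBSZ_SZU_SHIFT 8         /* size units field */
#define TOY_CMBSZ_SZ_SHIFT  12        /* size field, in size units */
#define TOY_CMBSZ_SZU_1MIB  2u        /* size unit encoding for 1 MiB */

/* Build a CMBSZ value for a buffer of cmb_size_mb MiB with SQS only. */
uint32_t toy_cmbsz(uint32_t cmb_size_mb)
{
    return (cmb_size_mb << TOY_CMBSZ_SZ_SHIFT) |
           (TOY_CMBSZ_SZU_1MIB << TOY_CMBSZ_SZU_SHIFT) |
           TOY_CMBSZ_SQS;
}
```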

vhost-user: pass message as a pointer to process_message_reply() · 3cf7daf8

Committed by Maxime Coquelin on May 24, 2017

process_message_reply() was recently updated to get full message
content instead of only its request field.

There is no need to copy all the struct content into the stack,
so just pass its pointer as const.
Reviewed-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

virtio_net: Bypass backends for MTU feature negotiation · 75ebec11

Committed by Maxime Coquelin on May 23, 2017

This patch adds a new internal "x-mtu-bypass-backend" property
to bypass backends for MTU feature negotiation.

When this property is set, the MTU feature is negotiated as soon
as supported by the guest and an MTU value is set via the host_mtu
parameter. In case the backend advertises the feature (e.g. DPDK's
vhost-user backend), the feature negotiation is propagated down to
the backend.

When this property is not set, the backend has to support the MTU
feature for its negotiation to succeed.

For compatibility purposes, this property is disabled for machine
types v2.9 and older.

Cc: Aaron Conole <aconole@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
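The decision described in this commit message can be condensed into a small predicate. This is only a sketch of the rule as worded above: toy_mtu_feature_offered is a hypothetical name, and treating host_mtu == 0 as "not configured" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Should the MTU feature be offered to the guest? */
bool toy_mtu_feature_offered(uint16_t host_mtu, bool mtu_bypass_backend,
                             bool backend_has_mtu)
{
    if (host_mtu == 0) {
        return false; /* assumed: no MTU value was set via host_mtu */
    }
    if (mtu_bypass_backend) {
        return true;  /* property set: the backend is not consulted */
    }
    return backend_has_mtu; /* property unset: the backend must support it */
}
```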

intel_iommu: support passthrough (PT) · dbaabb25

Committed by Peter Xu on May 19, 2017

Hardware support for VT-d device passthrough. Although current Linux can
live with iommu=pt even without this, this is faster than software
passthrough.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

intel_iommu: allow dev-iotlb context entry conditionally · f80c9874

Committed by Peter Xu on May 19, 2017

When device-iotlb is not specified, we should fail this check. A new
function vtd_ce_type_check() is introduced.

While I'm at it, clean up vtd_dev_to_context_entry() a bit: replace
many "else if" usages with direct if checks. That makes the logic
clearer.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

intel_iommu: use IOMMU_ACCESS_FLAG() · 5a38cb59

Committed by Peter Xu on May 19, 2017

We have that now, so why not use it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

intel_iommu: provide vtd_ce_get_type() · 127ff5c3

Committed by Peter Xu on May 19, 2017

Helper to fetch VT-d context entry type.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

intel_iommu: renaming context entry helpers · 8f7d7161

Committed by Peter Xu on May 19, 2017

The old names are too long and less ordered. Let's start to use
vtd_ce_*() as a pattern.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

x86-iommu: use DeviceClass properties · 0b77d30a

Committed by Peter Xu on May 19, 2017

No reason to keep tens of lines if we can actually do it in far fewer.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

memory: remove the last param in memory_region_iommu_replay() · ad523590

Committed by Peter Xu on May 19, 2017

We were always passing in that one as "false" to assume that's a read
operation, and we also assume that IOMMU translation would always have
that read permission. A better permission would be IOMMU_NONE since the
replay is after all not a real read operation, but just a page table
rebuilding process.

CC: David Gibson <david@gibson.dropbear.id.au>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>

memory: tune last param of iommu_ops.translate() · bf55b7af

Committed by Peter Xu on May 19, 2017

This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is a good example: during replay, we should not check any RW
permission bits since that's not an actual IO at all.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
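To show why a flags value expresses more than a bool, here is a sketch with a toy enum modelled on the none/read/write/read-write split described above. The TOY_* names and toy_access_allowed() are hypothetical; only the idea that "none" requires no permission bit is taken from the commit message.

```c
#include <stdbool.h>

/* Toy access flags: read and write are independent bits, "none" is zero. */
typedef enum {
    TOY_IOMMU_NONE = 0,
    TOY_IOMMU_RO   = 1,
    TOY_IOMMU_WO   = 2,
    TOY_IOMMU_RW   = 3
} ToyIOMMUAccessFlags;

/* The old bool could only ever request one of these two. */
ToyIOMMUAccessFlags toy_flag_from_is_write(bool is_write)
{
    return is_write ? TOY_IOMMU_WO : TOY_IOMMU_RO;
}

/* An access passes if every requested bit is granted by the entry. */
bool toy_access_allowed(ToyIOMMUAccessFlags entry_perm,
                        ToyIOMMUAccessFlags wanted)
{
    return (entry_perm & wanted) == wanted;
}
```

A replay that asks for TOY_IOMMU_NONE passes this check even for a write-only mapping, whereas the old "is_write = false" request would have been rejected for it.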

25 May, 2017 13 commits

9pfs: local: metadata file for the VirtFS root · 81ffbf5a

Committed by Greg Kurz on May 25, 2017

When using the mapped-file security, credentials are stored in a metadata
directory located in the parent directory. This is okay for all paths with
the notable exception of the root path, since we don't want and probably
can't create a metadata directory above the virtfs directory on the host.

This patch introduces a dedicated metadata file, sitting in the virtfs root
for this purpose. It relies on the fact that the "." name necessarily refers
to the virtfs root.

As for the metadata directory, we don't want the client to see this file.
The current code only cares for readdir() but there are many other places
to fix actually. The filtering logic is hence put in a separate function.

Before:

# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .
# chown root.root .
chown: changing ownership of '.': Is a directory
# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .

After:

# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .
# chown root.root .
# ls -ld
drwxr-xr-x. 3 root root 4096 May  5 12:50 .

and from the host:

$ ls -al .virtfs_metadata_root
-rwx------. 1 greg greg 26 May  5 12:50 .virtfs_metadata_root
$ cat .virtfs_metadata_root
virtfs.uid=0
virtfs.gid=0
Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Leo Gaspard <leo@gaspard.io>
[groug: work around a patchew false positive in
        local_set_mapped_file_attrat()]
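The "separate function" for the filtering logic could look roughly like the sketch below. The two names it hides are the ones that appear in these commit messages (.virtfs_metadata and .virtfs_metadata_root); the function name toy_is_mapped_file_metadata is hypothetical and the real helper may differ.

```c
#include <stdbool.h>
#include <string.h>

/* True for names the client must never see or touch in mapped-file mode. */
bool toy_is_mapped_file_metadata(const char *name)
{
    return strcmp(name, ".virtfs_metadata") == 0 ||
           strcmp(name, ".virtfs_metadata_root") == 0;
}
```

Keeping the check in one place lets every entry point that receives a name from the client (not only readdir) call the same predicate.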

9pfs: local: simplify file opening · 3dbcf273

Committed by Greg Kurz on May 25, 2017

The logic to open a path currently sits between local_open_nofollow() and
the relative_openat_nofollow() helper, which has no other user.

For the sake of clarity, this patch moves all the code of the helper into
its unique caller. While here we also:
- drop the code to skip leading "/" because the backend isn't supposed to
  pass anything but relative paths without consecutive slashes. The assert()
  is kept because we really don't want a buggy backend to pass an absolute
  path to openat().
- use strchrnul() to get a simpler code. This is ok since virtfs is for
  linux+glibc hosts only.
- don't dup() the initial directory and add an assert() to ensure we don't
  return the global mountfd to the caller. BTW, this would mean that the
  caller passed an empty path, which isn't supposed to happen either.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: fixed typos in changelog]

9pfs: local: resolve special directories in paths · f57f5878

Committed by Greg Kurz on May 25, 2017

When using the mapped-file security mode, the creds of a path /foo/bar
are stored in the /foo/.virtfs_metadata/bar file. This is okay for all
paths unless they end with '.' or '..', because we cannot create the
corresponding file in the metadata directory.

This patch ensures that '.' and '..' are resolved in all paths.

The core code only passes path elements (no '/') to the backend, with
the notable exception of the '/' path, which refers to the virtfs root.
This patch preserves the current behavior of converting it to '.' so
that it can be passed to "*at()" syscalls ('/' would mean the host root).
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>

9pfs: check return value of v9fs_co_name_to_path() · 4fa62005

Committed by Greg Kurz on May 25, 2017

These v9fs_co_name_to_path() call sites have always been around. I guess
no care was taken to check the return value because the name_to_path
operation could never fail at the time. This is no longer true: the
handle and synth backends can already fail this operation, and so will the
local backend soon.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>

9pfs: assume utimensat() and futimens() are present · 24df3371

Committed by Greg Kurz on May 25, 2017

The utimensat() and futimens() syscalls have been around for ages (ie,
glibc 2.6 and linux 2.6.22), and the decision was already taken to
switch to utimensat() anyway when fixing CVE-2016-9602 in 2.9.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>

9pfs: local: fix unlink of alien files in mapped-file mode · 6a87e792

Committed by Greg Kurz on May 25, 2017

When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.

This is a regression introduced by commit "df4938a6 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from

    ret = remove("$dir/.virtfs_metadata/$name");
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

to

    fd = openat("$dir/.virtfs_metadata")
    ret = unlinkat(fd, "$name")
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.

We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]

9pfs: drop pdu_push_and_notify() · a17d8659

Committed by Greg Kurz on May 25, 2017

Only pdu_complete() needs to notify the client that a request has completed.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

virtio-9p/xen-9p: move 9p specific bits to core 9p code · 506f3275

Committed by Greg Kurz on May 25, 2017

These bits aren't related to the transport so let's move them to the core
code.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

xics: add unrealize handler · 62f94fc9

Committed by Greg Kurz on May 24, 2017

Now that ICPState objects get finalized on CPU unplug, we should unregister
reset handlers as well to avoid a QEMU crash at machine reset time.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

hw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release · 16ee9980

Committed by Daniel Henrique Barboza on May 22, 2017

When a LMB hot unplug starts, the current DRC LMB status is stored at
spapr->pending_dimm_unplugs QTAILQ. This queue isn't migrated, thus
if a migration occurs in the middle of a LMB unplug the
spapr_lmb_release callback will lose track of the LMB unplug progress.

This patch implements a new recover function spapr_recover_pending_dimm_state
that is used inside spapr_lmb_release to recover this DRC LMB release
status that is lost during the migration.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic changes, simplify error handling]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

hw/ppc: migrating the DRC state of hotplugged devices · a50919dd

Committed by Daniel Henrique Barboza on May 22, 2017

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'indicator_state'
are involved in the DR state transition diagram from
PAPR+ 2.7, 13.4;

- 'configured', 'signalled', 'awaiting_release' and 'awaiting_allocation'
are needed in attaching and detaching devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque · 31834723

Committed by Daniel Henrique Barboza on May 22, 2017

The pointer drc->detach_cb is being used as a way of informing
the detach() function inside spapr_drc.c which cb to execute. This
information can also be retrieved simply by checking drc->type and
choosing the right callback based on it. In this context, detach_cb
is redundant information that must be managed.

After the previous spapr_lmb_release change, no detach_cb_opaque
is being used by any of the three callback functions. This is
yet another piece of information that is now unused and, on top of
that, can't be migrated either.

This patch makes the following changes:

- removal of detach_cb_opaque. The 'opaque' argument was removed from
the callbacks and from the detach() function of sPAPRConnectorClass. The
attribute detach_cb_opaque of sPAPRConnector was removed.

- removal of detach_cb from the detach() call. The function pointer
detach_cb of sPAPRConnector was removed. detach() now uses a
switch(drc->type) to execute the appropriate callback. To achieve this,
spapr_core_release, spapr_lmb_release and spapr_phb_remove_pci_device_cb
callbacks were made public to be visible inside detach().
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
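A rough sketch of the idea: instead of storing a function pointer in the connector, the release callback is derived from the connector type. The ToyDRC types and toy_* functions are hypothetical stand-ins for the sPAPR structures and for the three callbacks named above.

```c
/* Toy connector types; PHB is included to show the "no callback" case. */
typedef enum {
    TOY_DRC_CPU,
    TOY_DRC_LMB,
    TOY_DRC_PCI,
    TOY_DRC_PHB
} ToyDRCType;

typedef struct ToyDRC {
    ToyDRCType type;
    const char *released_by; /* records which callback ran */
} ToyDRC;

void toy_core_release(ToyDRC *drc) { drc->released_by = "core"; }
void toy_lmb_release(ToyDRC *drc)  { drc->released_by = "lmb"; }
void toy_pci_release(ToyDRC *drc)  { drc->released_by = "pci"; }

/* No stored callback and no opaque pointer: the type alone decides. */
int toy_detach(ToyDRC *drc)
{
    switch (drc->type) {
    case TOY_DRC_CPU:
        toy_core_release(drc);
        return 0;
    case TOY_DRC_LMB:
        toy_lmb_release(drc);
        return 0;
    case TOY_DRC_PCI:
        toy_pci_release(drc);
        return 0;
    default:
        return -1; /* this sketch has no release callback for the type */
    }
}
```

Since the type is plain data, nothing here is a pointer that would have to survive a migration.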

hw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState · 0cffce56

Committed by David Gibson on May 24, 2017

The LMB DRC release callback, spapr_lmb_release(), uses an opaque
parameter, a sPAPRDIMMState struct that stores the current LMBs that
are allocated to a DIMM (nr_lmbs). After each call to this callback,
the nr_lmbs is decremented by one and, when it reaches zero, the callback
proceeds with the qdev calls to hot unplug the LMB.

Using drc->detach_cb_opaque is problematic because it can't be migrated in
the future DRC migration work. This patch makes the following changes to
eliminate the usage of this opaque callback inside spapr_lmb_release:

- sPAPRDIMMState was moved from spapr.c and added to spapr.h. A new
attribute called 'addr' was added to it. This is used as a unique
identifier to associate a sPAPRDIMMState to a PCDIMM element.

- sPAPRMachineState now hosts a new QTAILQ called 'pending_dimm_unplugs'.
This queue of sPAPRDIMMState elements will store the DIMM state of DIMMs
that are currently undergoing an unplug process.

- spapr_lmb_release() will now retrieve the nr_lmbs value by getting the
corresponding sPAPRDIMMState. A helper function called spapr_dimm_get_address
was created to fetch the address of a PCDIMM device inside spapr_lmb_release.
When nr_lmbs reaches zero and the callback proceeds with the qdev hot unplug
calls, the sPAPRDIMMState struct is removed from spapr->pending_dimm_unplugs.

After these changes, the opaque argument for spapr_lmb_release is now
unused and is passed as NULL inside spapr_del_lmbs. This and the other
opaque arguments can now be safely removed from the code.

As an additional cleanup made by this patch, the spapr_del_lmbs function
was merged with spapr_memory_unplug_request. The former was being called
only by the latter and both were small enough to fit one single function.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic cleanups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
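The bookkeeping described above (a per-machine list keyed by DIMM address, with a counter that is decremented once per released LMB) might be sketched like this. A singly linked list stands in for the QTAILQ, and all Toy* and toy_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ToyDIMMState {
    uint64_t addr;    /* unique key: address of the DIMM */
    uint32_t nr_lmbs; /* LMBs still waiting to be released */
    struct ToyDIMMState *next;
} ToyDIMMState;

typedef struct ToyMachine {
    ToyDIMMState *pending_dimm_unplugs;
} ToyMachine;

/* Record that an unplug of nr_lmbs LMBs has started for the DIMM at addr. */
bool toy_pending_add(ToyMachine *m, uint64_t addr, uint32_t nr_lmbs)
{
    ToyDIMMState *ds;

    if (nr_lmbs == 0) {
        return false;
    }
    ds = malloc(sizeof(*ds));
    if (ds == NULL) {
        return false;
    }
    ds->addr = addr;
    ds->nr_lmbs = nr_lmbs;
    ds->next = m->pending_dimm_unplugs;
    m->pending_dimm_unplugs = ds;
    return true;
}

/* One LMB was released; returns true when the whole DIMM can be unplugged. */
bool toy_lmb_release(ToyMachine *m, uint64_t addr)
{
    ToyDIMMState **link = &m->pending_dimm_unplugs;
    ToyDIMMState *ds;

    while (*link != NULL && (*link)->addr != addr) {
        link = &(*link)->next;
    }
    if (*link == NULL) {
        return false; /* no unplug in progress for this DIMM */
    }
    ds = *link;
    if (--ds->nr_lmbs > 0) {
        return false; /* more LMBs still to come */
    }
    *link = ds->next; /* last LMB: drop the entry from the pending list */
    free(ds);
    return true;
}
```

The callback needs only the machine and the DIMM address, so no opaque pointer has to be carried along.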

24 May, 2017 8 commits

spapr: add pre_plug function for memory · c871bc70

Committed by Laurent Vivier on May 23, 2017

This allows errors to be handled before the memory
has started to be hotplugged. We already have
the function for the CPU cores.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a couple of style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

pseries: Restore support for total vcpus not a multiple of threads-per-core for old machine types · 459264ef

Committed by David Gibson on May 23, 2017

As of pseries-2.7 and later, we require the total number of guest vcpus to
be a multiple of the threads-per-core. pseries-2.6 and earlier machine
types, however, are supposed to allow this for the sake of migration from
old qemu versions which allowed this.

Unfortunately, 8149e299 "pseries: Enforce homogeneous threads-per-core"
broke this by not considering the old machine type case. This fixes it by
only applying the check when the machine type supports hotpluggable cpus.
By not-entirely-coincidence, that corresponds to the same time when we
started enforcing total threads being a multiple of threads-per-core.

Fixes: 8149e299
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
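The rule this fix restores can be written as a small predicate. It is a sketch of the behaviour as described in the commit message only: toy_smp_config_valid and its parameters are hypothetical names, with has_hotpluggable_cpus standing for "the machine type supports hotpluggable cpus".

```c
#include <stdbool.h>

/* Is this vcpu count acceptable for the given machine type? */
bool toy_smp_config_valid(unsigned smp_cpus, unsigned smp_threads,
                          bool has_hotpluggable_cpus)
{
    if (smp_threads == 0) {
        return false; /* guard against division by zero in this sketch */
    }
    if (!has_hotpluggable_cpus) {
        return true;  /* old machine types: the check is not applied */
    }
    return smp_cpus % smp_threads == 0;
}
```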

pseries: Split CAS PVR negotiation out into a separate function · 80c33d34

Committed by David Gibson on May 18, 2017

Guests of the qemu machine type go through a feature negotiation process
known as "client architecture support" (CAS) during early boot.  This does
a number of things, one of which is finding a CPU compatibility mode which
can be supported by both guest and host.

In fact the CPU negotiation is probably the single most complex part of the
CAS process, so this splits it out into a helper function.  We've recently
made some mistakes in maintaining backward compatibility for old machine
types here.  Splitting this out will also make it easier to fix this.

This also adds a possibly useful error message if the negotiation fails
(i.e. if there isn't a CPU mode that's suitable for both guest and host).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: fix error reporting in xics_system_init() · 3d85885a

Committed by Greg Kurz on May 19, 2017

If the user explicitly asked for kernel-irqchip support and "xics-kvm"
initialization fails, we shouldn't fall back to emulated "xics" as we
do now. It is also awkward to print an error message when we have an
errp pointer argument.

Let's use the errp argument to report the error and let the caller decide.
This simplifies the code as we don't need a local Error * here.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_cpu_core: drop reference on ICP object during CPU realization · 249127d0

Committed by Greg Kurz on May 19, 2017

When a piece of code allocates an object, it implicitly gets a reference
on it. If it then makes that object a child property of another object, it
should drop its own reference at some point otherwise the child object can
never be finalized. The current code hence leaks one ICP object per CPU
when hot-removing a core.

Failing to add a newly allocated ICP object to the CPU is a bug. While here,
let's ensure QEMU aborts if this ever happens.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
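The reference-counting rule in the first paragraph of this commit message can be demonstrated with a toy object model. ToyObject, ToyParent and the toy_* functions are hypothetical and much simpler than QEMU's object model; the sketch only shows why the creator has to drop its own reference.

```c
#include <stdlib.h>

typedef struct ToyObject {
    unsigned ref;   /* reference count */
    int *finalized; /* incremented when the object is finally freed */
} ToyObject;

typedef struct ToyParent {
    ToyObject *child;
} ToyParent;

/* Allocation hands the caller one implicit reference. */
ToyObject *toy_object_new(int *finalized)
{
    ToyObject *o = malloc(sizeof(*o));

    if (o != NULL) {
        o->ref = 1;
        o->finalized = finalized;
    }
    return o;
}

void toy_object_unref(ToyObject *o)
{
    if (--o->ref == 0) {
        (*o->finalized)++;
        free(o);
    }
}

/* The parent takes its own reference on the child. */
void toy_add_child(ToyParent *p, ToyObject *child)
{
    child->ref++;
    p->child = child;
}

/* Removing the child drops only the parent's reference. */
void toy_unparent(ToyParent *p)
{
    toy_object_unref(p->child);
    p->child = NULL;
}
```

If the creator skips its own toy_object_unref() after toy_add_child(), the count never reaches zero on toy_unparent() and the object is leaked, which is the pattern the commit message describes.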

hw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry · bff30638

Committed by Daniel Henrique Barboza on May 19, 2017

Currently we do not have any RTAS event that is reported by the
event-scan interface. The existing events, RTAS_LOG_TYPE_EPOW and
RTAS_LOG_TYPE_HOTPLUG, are being reported by the check-exception
interface and, as such, marked as 'exception=true'.

Commit 79853e18, 'spapr_events: event-scan RTAS interface', added
the event_scan interface because the guest kernel requires it to
initialize other required interfaces. It is acting since then as
a stub because no events that would be reported by it were added
since then. However, the existence of the 'exception' boolean adds
an unnecessary load in the future migration of the pending_events,
sPAPREventLogEntry QTAILQ that hosts the pending RTAS events.

To make the code cleaner and ease the future migration changes, this
patch makes the following changes:

- remove the 'exception' boolean that filters these events. There is
nothing to filter since all events are reported by check-exception;

- functions rtas_event_log_queue, rtas_event_log_dequeue and
rtas_event_log_contains don't receive the 'exception' boolean
as parameter;

- event_scan function was simplified. It was calling
'rtas_event_log_dequeue(mask, false)' that was always returning
'NULL' because we have no events that are created with
exception=false, thus in the end it would execute a jump to
'out_no_events' all the time. The function now assumes that
this will always be the case and all the remaining logic was
deleted.

In the future, when or if we add new RTAS events that should
be reported with the event_scan interface, we can refer to
the changes made in this patch to add the event_scan logic
back.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: ensure core_slot isn't NULL in spapr_core_unplug() · 07572c06

Committed by Greg Kurz on May 18, 2017

If we go that far on the path of hot-removing a core and we find out that
the core-id is invalid, then we have a serious bug.

Let's make it explicit with an assert() instead of dereferencing a NULL
pointer.

This fixes Coverity issue CID 1375404.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

xics_kvm: cache already enabled vCPU ids · de86eccc

Committed by Greg Kurz on May 17, 2017

Since commit a45863bd ("xics_kvm: Don't enable KVM_CAP_IRQ_XICS if
already enabled"), we were able to re-hotplug a vCPU that had been
hot-unplugged earlier, thanks to a boolean flag in ICPState that we set
when enabling KVM_CAP_IRQ_XICS.

This could work because the lifecycle of all ICPState objects was the
same as the machine. Commit 5bc8d26d ("spapr: allocate the ICPState
object from under sPAPRCPUCore") broke this assumption and now we always
pass a freshly allocated ICPState object (ie, with the flag unset) to
icp_kvm_cpu_setup().

This causes re-hotplug to fail with:

Unable to connect CPU8 to kernel XICS: Device or resource busy

Let's fix this by caching all the vCPU ids for which KVM_CAP_IRQ_XICS was
enabled. This also drops the now useless boolean flag from ICPState.
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
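The caching approach in this commit message can be sketched as follows. The list lives outside the per-CPU object, so it survives the object being freed and reallocated on re-hotplug. ToyKVM, ToyEnabledCap and toy_icp_cpu_setup are hypothetical names, and the counter stands in for the actual request that enables the capability.

```c
#include <stdlib.h>

typedef struct ToyEnabledCap {
    unsigned long vcpu_id;
    struct ToyEnabledCap *next;
} ToyEnabledCap;

typedef struct ToyKVM {
    ToyEnabledCap *enabled; /* vCPU ids for which the capability is on */
    int enable_calls;       /* how often the "kernel" was asked to enable */
} ToyKVM;

/* Connect a vCPU; enable the capability only on first sight of its id. */
int toy_icp_cpu_setup(ToyKVM *k, unsigned long vcpu_id)
{
    ToyEnabledCap *e;

    for (e = k->enabled; e != NULL; e = e->next) {
        if (e->vcpu_id == vcpu_id) {
            return 0; /* already enabled earlier, e.g. before a hot-unplug */
        }
    }

    e = malloc(sizeof(*e));
    if (e == NULL) {
        return -1;
    }
    k->enable_calls++; /* stand-in for the real enable request */
    e->vcpu_id = vcpu_id;
    e->next = k->enabled;
    k->enabled = e;
    return 0;
}
```

Keying the cache on the vCPU id rather than on a flag inside ICPState is what makes it independent of the ICPState lifecycle.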