提交 · 2cd908d0add803886c310084145fecc93f080a63 · openeuler / qemu

01 3月, 2017 8 次提交

ppc/xics: use the QOM interface to resend irqs · 2cd908d0

由 Cédric Le Goater 提交于 2月 27, 2017

Also change the ICPState 'xics' backlink to be a XICSFabric, this
removes the need of using qdev_get_machine() to get the QOM interface
in some of the routines.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

2cd908d0

ppc/xics: use the QOM interface under the sPAPR machine · 7844e12b

由 Cédric Le Goater 提交于 2月 27, 2017

Add 'ics_get' and 'ics_resend' handlers to the sPAPR machine. These
are relatively simple for a single ICS.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

7844e12b

ppc/xics: store the ICS object under the sPAPR machine · 681bfade

由 Cédric Le Goater 提交于 2月 27, 2017

A list of ICS objects was introduced under the XICS object for the
PowerNV machine but, for the sPAPR machine, it brings extra complexity
as there is only a single ICS. To simplify the code, let's add the ICS
pointer under the sPAPR machine and try to reduce the use of this list
where possible.

Also, change the xics_spapr_*() routines to use an ICS object instead
of an XICSState and change their name to reflect that these are
specific to the sPAPR ICS object.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

681bfade

ppc/xics: remove set_nr_servers() handler from XICSStateClass · 817bb6a4

由 Cédric Le Goater 提交于 2月 27, 2017

Today, the ICP (Interrupt Controller Presenter) objects are created by
the 'nr_servers' property handler of the XICS object and a class
handler. They are realized in the XICS object realize routine.

Let's simplify the process by creating the ICP objects along with the
XICS object at the machine level.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

817bb6a4

ppc/xics: remove set_nr_irqs() handler from XICSStateClass · 4e4169f7

由 Cédric Le Goater 提交于 2月 27, 2017

Today, the ICS (Interrupt Controller Source) object is created and
realized by the init and realize routines of the XICS object, but some
of the parameters are only known at the machine level.

These parameters are passed from the sPAPR machine to the ICS object
in a rather convoluted way using property handlers and a class handler
of the XICS object. The number of irqs required to allocate the IRQ
state objects in the ICS realize routine is one of them.

Let's simplify the process by creating the ICS object along with the
XICS object at the machine level and link the ICS into the XICS list
of ICSs at this level also. In the sPAPR machine, there is only a
single ICS but that will change with the PowerNV machine.

Also, QOMify the creation of the objects and get rid of the
superfluous code.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

4e4169f7

xics: XICS should not be a SysBusDevice · 738d5db8

由 David Gibson 提交于 2月 14, 2017

Currently xics - the component of the IBM POWER interrupt controller
representing the overall interrupt fabric / architecture is
represented as a descendent of SysBusDevice.  However, this is not
really correct - the xics presents nothing in MMIO space so it should
be an "unattached" device in the current QOM model.

Since this device will always be created by the machine type, not created
specifically from the command line, and because it has no migrated state
it should be safe to move it around the device composition tree.

Therefore this patch changes it to a descendent of TYPE_DEVICE, and
makes it an unattached device.  So that its reset handler still gets
called correctly, we add a qdev_set_parent_bus() to attach it to
sysbus.  It's not really clear that's correct (instead of using
register_reset()) but it appears to a common technique.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
[clg corrected problems with reset]
Signed-off-by: NCédric Le Goater <clg@kaod.org>
[dwg folded together and updated commit message]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

738d5db8

target/ppc: Manage external HPT via virtual hypervisor · e57ca75c

由 David Gibson 提交于 2月 23, 2017

The pseries machine type implements the behaviour of a PAPR compliant
hypervisor, without actually executing such a hypervisor on the virtual
CPU. To do this we need some hooks in the CPU code to make hypervisor
facilities get redirected to the machine instead of emulated internally.

For hypercalls this is managed through the cpu->vhyp field, which points
to a QOM interface with a method implementing the hypercall.

For the hashed page table (HPT) - also a hypervisor resource - we use an
older hack. CPUPPCState has an 'external_htab' field which when non-NULL
indicates that the HPT is stored in qemu memory, rather than within the
guest's address space.

For consistency - and to make some future extensions easier - this merges
the external HPT mechanism into the vhyp mechanism. Methods are added
to vhyp for the basic operations the core hash MMU code needs: map_hptes()
and unmap_hptes() for reading the HPT, store_hpte() for updating it and
hpt_mask() to retrieve its size.

To match this, the pseries machine now sets these vhyp fields in its
existing vhyp class, rather than reaching into the cpu object to set the
external_htab field.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>

e57ca75c

sysemu: support up to 1024 vCPUs · 6244bb7e

由 Greg Kurz 提交于 2月 24, 2017

Some systems can already provide more than 255 hardware threads.

Bumping the QEMU limit to 1024 seems reasonable:
- it has no visible overhead in top;
- the limit itself has no effect on hot paths.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

6244bb7e

24 2月, 2017 1 次提交

tcg: drop global lock during TCG code execution · 8d04fb55

由 Jan Kiszka 提交于 2月 23, 2017

This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimization for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independent on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: NEmilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NPranith Kumar <bobby.prani@gmail.com>
[PM: target-arm changes]
Acked-by: NPeter Maydell <peter.maydell@linaro.org>

8d04fb55

22 2月, 2017 6 次提交

hw/ppc/spapr: Check for valid page size when hot plugging memory · df587133

由 Thomas Huth 提交于 2月 15, 2017

On POWER, the valid page sizes that the guest can use are bound
to the CPU and not to the memory region. QEMU already has some
fancy logic to find out the right maximum memory size to tell
it to the guest during boot (see getrampagesize() in the file
target/ppc/kvm.c for more information).
However, once we're booted and the guest is using huge pages
already, it is currently still possible to hot-plug memory regions
that does not support huge pages - which of course does not work
on POWER, since the guest thinks that it is possible to use huge
pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT,
QEMU spills out a not very helpful error message together with
a register dump and the user is annoyed that the VM unexpectedly
died.
To avoid this situation, we should check the page size of hot-plugged
DIMMs to see whether it is possible to use it in the current VM.
If it does not fit, we can print out a better error message and
refuse to add it, so that the VM does not die unexpectely and the
user has a second chance to plug a DIMM with a matching memory
backend instead.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1419466Signed-off-by: NThomas Huth <thuth@redhat.com>
[dwg: Fix a build error on 32-bit builds with KVM]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

df587133

machine: replace query_hotpluggable_cpus() callback with has_hotpluggable_cpus flag · c5514d0e

由 Igor Mammedov 提交于 2月 10, 2017

Generic helper machine_query_hotpluggable_cpus() replaced
target specific query_hotpluggable_cpus() callbacks so
there is no need in it anymore. However inon NULL callback
value is used to detect/report hotpluggable cpus support,
therefore it can be removed completely.
Replace it with MachineClass.has_hotpluggable_cpus boolean
which is sufficient for the task.
Suggested-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

c5514d0e

machine: unify [pc_|spapr_]query_hotpluggable_cpus() callbacks · f2d672c2

由 Igor Mammedov 提交于 2月 09, 2017

All callbacks FOO_query_hotpluggable_cpus() are practically
the same except of setting vcpus_count to different values.
Convert them to a generic machine_query_hotpluggable_cpus()
callback by moving vcpus_count initialization to per machine
specific callback possible_cpu_arch_ids().
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

f2d672c2

spapr: reuse machine->possible_cpus instead of cores[] · 535455fd

由 Igor Mammedov 提交于 2月 10, 2017

Replace SPAPR specific cores[] array with generic
machine->possible_cpus and store core objects there.
It makes cores bookkeeping similar to x86 cpus and
will allow to unify similar code.
It would allow to replace cpu_index based NUMA node
mapping with iproperty based one (for -device created
cores) since possible_cpus carries board defined
topology/layout.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

535455fd

spapr: make cpu core unplug follow expected hotunplug call flow · 115debf2

由 Igor Mammedov 提交于 2月 02, 2017

spapr_core_unplug() were essentially spapr_core_unplug_request()
handler that requested CPU removal and registered callback
which did actual cpu core removali but it was called from
spapr_machine_device_unplug() which is intended for actual object
removal. Commit (cf632463 spapr: Memory hot-unplug support)
sort of fixed it introducing spapr_machine_device_unplug_request()
and calling spapr_core_unplug() but it hasn't renamed callback and
by mistake calls it from spapr_machine_device_unplug().

However spapr_machine_device_unplug() isn't ever called for
cpu core since spapr_core_release() doesn't follow expected
hotunplug call flow which is:
 1: device_del() ->
        hotplug_handler_unplug_request() ->
            set destroy_cb()
 2: destroy_cb() ->
        hotplug_handler_unplug() ->
            object_unparent // actual device removal

Fix it by renaming spapr_core_unplug() to spapr_core_unplug_request()
which is called from spapr_machine_device_unplug_request() and
making spapr_core_release() call hotplug_handler_unplug() which
will call spapr_machine_device_unplug() -> spapr_core_unplug()
to remove cpu core.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reveiwed-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

115debf2

spapr: move spapr_core_[foo]plug() callbacks close to machine code in spapr.c · ff9006dd

由 Igor Mammedov 提交于 2月 02, 2017

spapr_core_pre_plug/spapr_core_plug/spapr_core_unplug() are managing
wiring CPU core into spapr machine state and not internal CPU core state.
So move them from spapr_cpu_core.c to spapr.c where other similar
(spapr_memory_[foo]plug()) callbacks are located, which also matches
x86 target practice.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

ff9006dd

01 2月, 2017 1 次提交

ppc: switch to constants within BUILD_BUG_ON · 32f825de

由 Michael S. Tsirkin 提交于 1月 27, 2017

We are switching BUILD_BUG_ON to verify that it's parameter is a
compile-time constant, and it turns out that some gcc versions
(specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
not smart enough to figure it out for expressions involving local
variables. This is harmless but means that the check is ineffective for
these platforms.  To fix, replace the variable with macros.
Reported-by: NPeter Maydell <peter.maydell@linaro.org>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

32f825de

31 1月, 2017 10 次提交

ppc: switch to constants within BUILD_BUG_ON · 25e6a118

由 Michael S. Tsirkin 提交于 1月 27, 2017

We are switching BUILD_BUG_ON to verify that it's parameter is a
compile-time constant, and it turns out that some gcc versions
(specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
not smart enough to figure it out for expressions involving local
variables. This is harmless but means that the check is ineffective for
these platforms.  To fix, replace the variable with macros.
Reported-by: NPeter Maydell <peter.maydell@linaro.org>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
[dwg: Correct a printf format warning]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

25e6a118

spapr: clock should count only if vm is running · 42043e4f

由 Laurent Vivier 提交于 1月 27, 2017

This is a port to ppc of the i386 commit:
    00f4d64e kvmclock: clock should count only if vm is running

We remove timebase_post_load function, and use the VM state
change handler to save and restore the guest_timebase (on stop
and continue).

We keep timebase_pre_save to reduce the clock difference on
migration like in:
    6053a86f kvmclock: reduce kvmclock difference on migration

Time base offset has originally been introduced by commit
    98a8b524 spapr: Add support for time base offset migration

So while VM is paused, the time is stopped. This allows to have
the same result with date (based on Time Base Register) and
hwclock (based on "get-time-of-day" RTAS call).

Moreover in TCG mode, the Time Base is always paused, so this
patch also adjust the behavior between TCG and KVM.

VM state field "time_of_the_day_ns" is now useless but we keep
it to be able to migrate to older version of the machine.

As vmstate_ppc_timebase structure (with timebase_pre_save() and
timebase_post_load() functions) was only used by vmstate_spapr,
we register the VM state change handler only in ppc_spapr_init().
Signed-off-by: NLaurent Vivier <lvivier@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

42043e4f

ppc: Rewrite ppc_get_compat_smt_threads() · 12dbeb16

由 David Gibson 提交于 10月 28, 2016

To continue consolidation of compatibility mode information, this rewrites
the ppc_get_compat_smt_threads() function using the table of compatiblity
modes in target-ppc/compat.c.

It's not a direct replacement, the new ppc_compat_max_threads() function
has simpler semantics - it just returns the number of threads the cpu
model has, taking into account any compatiblity mode it is in.

This no longer takes into account kvmppc_smt_threads() as the previous
version did.  That check wasn't useful because we check in
ppc_cpu_realizefn() that CPUs aren't instantiated with more threads
than kvm allows (or if we didn't things will already be broken and
this won't make it any worse).
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

12dbeb16

pseries: Add pseries-2.9 machine type · fa325e6c

由 David Gibson 提交于 12月 08, 2016

Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NLaurent Vivier <lvivier@redhat.com>

fa325e6c

hw/ppc/spapr: Fix boot path of usb-host storage devices · b99260eb

由 Thomas Huth 提交于 12月 14, 2016

When passing through an USB storage device to a pseries guest, it
is currently not possible to automatically boot from the device
if the "bootindex" property has been specified, too (e.g. when using
"-device nec-usb-xhci -device usb-host,hostbus=1,hostaddr=2,bootindex=0"
at the command line). The problem is that QEMU builds a device tree path
like "/pci@800000020000000/usb@0/usb-host@1" and passes it to SLOF
in the /chosen/qemu,boot-list property. SLOF, however, probes the
USB device, recognizes that it is a storage device and thus changes
its name to "storage", and additionally adds a child node for the
SCSI LUN, so the correct boot path in SLOF is something like
"/pci@800000020000000/usb@0/storage@1/disk@101000000000000" instead.
So when we detect an USB mass storage device with SCSI interface,
we've got to adjust the firmware boot-device path properly that
SLOF can automatically boot from the device.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1354177Signed-off-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

b99260eb

ppc/spapr: implement H_SIGNAL_SYS_RESET · 1c7ad77e

由 Nicholas Piggin 提交于 12月 05, 2016

The H_SIGNAL_SYS_RESET hcall allows a guest CPU to raise a system reset
exception on CPUs within the same guest -- all CPUs, all-but-self, or a
specific CPU (including self).

This has not made its way to a PAPR release yet, but we have an hcall
number assigned.

  H_SIGNAL_SYS_RESET = 0x380

  Syntax:
    hcall(uint64 H_SIGNAL_SYS_RESET, int64 target);

  Generate a system reset NMI on the threads indicated by target.

  Values for target:
    -1 = target all online threads including the caller
    -2 = target all online threads except for the caller
    All other negative values: reserved
    Positive values: The thread to be targeted, obtained from the value
    of the "ibm,ppc-interrupt-server#s" property of the CPU in the OF
    device tree.

  Semantics:
    - Invalid target: return H_Parameter.
    - Otherwise: Generate a system reset NMI on target thread(s),
      return H_Success.
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

1c7ad77e

ppc: Rename cpu_version to compat_pvr · d6e166c0

由 David Gibson 提交于 10月 28, 2016

The 'cpu_version' field in PowerPCCPU is badly named.  It's named after the
'cpu-version' device tree property where it is advertised, but that meaning
may not be obvious in most places it appears.

Worse, it doesn't even really correspond to that device tree property.  The
property contains either the processor's PVR, or, if the CPU is running in
a compatibility mode, a special "logical PVR" representing which mode.

Rename the cpu_version field, and a number of related variables to
compat_pvr to make this clearer.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: NThomas Huth <thuth@redhat.com>

d6e166c0

ppc: Clean up and QOMify hypercall emulation · 1d1be34d

由 David Gibson 提交于 10月 28, 2016

The pseries machine type is a bit unusual in that it runs a paravirtualized
guest.  The guest expects to interact with a hypervisor, and qemu
emulates the functions of that hypervisor directly, rather than executing
hypervisor code within the emulated system.

To implement this in TCG, we need to intercept hypercall instructions and
direct them to the machine's hypercall handlers, rather than attempting to
perform a privilege change within TCG.  This is controlled by a global
hook - cpu_ppc_hypercall.

This cleanup makes the handling a little cleaner and more extensible than
a single global variable.  Instead, each CPU to have hypercalls intercepted
has a pointer set to a QOM object implementing a new virtual hypervisor
interface.  A method in that interface is called by TCG when it sees a
hypercall instruction.  It's possible we may want to add other methods in
future.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

1d1be34d

pseries: Make cpu_update during CAS unconditional · 5b120785

由 David Gibson 提交于 10月 29, 2016

spapr_h_cas_compose_response() includes a cpu_update parameter which
controls whether it includes updated information on the CPUs in the device
tree fragment returned from the ibm,client-architecture-support (CAS) call.

Providing the updated information is essential when CAS has negotiated
compatibility options which require different cpu information to be
presented to the guest. However, it should be safe to provide in other
cases (it will just override the existing data in the device tree with
identical data). This simplifies the code by removing the parameter and
always providing the cpu update information.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

5b120785

pseries: Always use core objects for CPU construction · 0c86d0fd

由 David Gibson 提交于 11月 08, 2016

Currently the pseries machine has two paths for constructing CPUs. On
newer machine type versions, which support cpu hotplug, it constructs
cpu core objects, which in turn construct CPU threads. For older machine
versions it individually constructs the CPU threads.

This division is going to make some future changes to the cpu construction
harder, so this patch unifies them. Now cpu core objects are always
created. This requires some updates to allow core objects to be created
without a full complement of threads (since older versions allowed a
number of cpus not a multiple of the threads-per-core). Likewise it needs
some changes to the cpu core hot/cold plug path so as not to choke on the
old machine types without hotplug support.

For good measure, we move the cpu construction to its own subfunction,
spapr_init_cpus().
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NGreg Kurz <groug@kaod.org>

0c86d0fd

20 1月, 2017 1 次提交

kvm: move cpu synchronization code · b3946626

由 Vincent Palatin 提交于 1月 10, 2017

Move the generic cpu_synchronize_ functions to the common hw_accel.h header,
in order to prepare for the addition of a second hardware accelerator.
Signed-off-by: NStefan Weil <sw@weilnetz.de>
Signed-off-by: NVincent Palatin <vpalatin@chromium.org>
Message-Id: <f5c3cffe8d520011df1c2e5437bb814989b48332.1484045952.git.vpalatin@chromium.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b3946626

01 12月, 2016 1 次提交

spapr: fix default DRC state for coldplugged LMBs · 5c0139a8

由 Michael Roth 提交于 11月 30, 2016

Currently we set the initial isolation/allocation state for DRCs
associated with coldplugged LMBs to ISOLATED/UNUSABLE,
respectively, under the assumption that the guest will move this
state to UNISOLATED/USABLE.

In fact, this is only the case for LMBs added via hotplug. For
coldplugged LMBs, the guest actually assumes the initial state to
be UNISOLATED/USABLE.

In practice, this only becomes an issue when we attempt to unplug
one of these LMBs, where the guest kernel will issue an
rtas-get-sensor-state call to check that the corresponding DRC is
in an USABLE state before it will release the LMB back to
QEMU. If the returned state is otherwise, the guest will assume no
further action is needed, which bypasses the QEMU-side cleanup that
occurs during the USABLE->UNUSABLE transition. This results in
LMBs and their corresponding pc-dimm devices to stick around
indefinitely.

This patch fixes the issue by manually setting DRCs associated with
cold-plugged LMBs to UNISOLATED/ALLOCATED, but leaving the hotplug
state untouched. As it turns out, this is analogous to the handling
for cold-plugged CPUs in spapr_core_plug().

Cc: qemu-ppc@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

5c0139a8

23 11月, 2016 3 次提交

spapr: Fix 2.7<->2.8 migration of PCI host bridge · 5c4537bd

由 David Gibson 提交于 11月 23, 2016

daa23699 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
from qemu-2.7 to the current version.  It split the device's MMIO
window into two pieces for 32-bit and 64-bit MMIO.

The patch included backwards compatibility code to convert the old
property into the new format.  However, the property value was also
transferred in the migration stream and compared with a (probably
unwise) VMSTATE_EQUAL.  So, the "raw" value from 2.7 is compared to
the new style converted value from (pre-)2.8 giving a mismatch and
migration failure.

Along with the actual field that caused the breakage, there are
several other ill-advised VMSTATE_EQUAL()s.  To fix forwards
migration, we read the values in the stream into scratch variables and
ignore them, instead of comparing for equality.  To fix backwards
migration, we populate those scratch variables in pre_save() with
adjusted values to match the old behaviour.

To permit the eventual possibility of removing this cruft from the
stream, we only include these compatibility fields if a new
'pre-2.8-migration' property is set.  We clear it on the pseries-2.8
machine type, which obviously can't be migrated backwards, but set it
on earlier machine type versions.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NGreg Kurz <groug@kaod.org>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

5c4537bd

target-ppc: Allow eventual removal of old migration mistakes · 146c11f1

由 David Gibson 提交于 11月 21, 2016

Until very recently, the vmstate for ppc cpus included some poorly
thought out VMSTATE_EQUAL() components, that can easily break
migration compatibility, and did so between qemu-2.6 and later
versions.  A hack was recently added which fixes this migration
breakage, but it leaves the unhelpful cruft of these fields in the
migration stream.

This patch adds a new cpu property allowing these fields to be removed
from the stream entirely.  For the pseries-2.8 machine type - which
comes after the fix - and for all non-pseries machine types - which
aren't mature enough to care about cross-version migration - we remove
the fields from the stream.

For pseries-2.7 and earlier, The migration hack remains in place,
allowing backwards and forwards migration with the older machine
types.

This restricts the migration compatibility cruft to older machine
types, and at least opens the possibility of eventually deprecating
and removing it entirely.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NGreg Kurz <groug@kaod.org>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

146c11f1

spapr: migration support for CAS-negotiated option vectors · 62ef3760

由 Michael Roth 提交于 11月 17, 2016

With the additional of the OV5_HP_EVT option vector, we now have
certain functionality (namely, memory unplug) that checks at run-time
for whether or not the guest negotiated the option via CAS. Because
we don't currently migrate these negotiated values, we are unable
to unplug memory from a guest after it's been migrated until after
the guest is rebooted and CAS-negotiation is repeated.

This patch fixes this by adding CAS-negotiated options to the
migration stream. We do this using a subsection, since the
negotiated value of OV5_HP_EVT is the only option currently needed
to maintain proper functionality for a running guest.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

62ef3760

31 10月, 2016 1 次提交

*_run_on_cpu: introduce run_on_cpu_data type · 14e6fe12

由 Paolo Bonzini 提交于 10月 31, 2016

This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

14e6fe12

28 10月, 2016 8 次提交

clean-up: removed duplicate #includes · 814bb12a

由 Anand J 提交于 10月 21, 2016

Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.
Reviewed-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NAnand J <anand.indukala@gmail.com>
Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>

814bb12a

spapr: Memory hot-unplug support · cf632463

由 Bharata B Rao 提交于 10月 26, 2016

Add support to hot remove pc-dimm memory devices.

Since we're introducing a machine-level unplug_request hook, we also
had handling for CPU unplug there as well to ensure CPU unplug
continues to work as it did before.
Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
* add hooks to CAS/cmdline enablement of hotplug ACR support
* add hook for CPU unplug
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

cf632463

spapr: use count+index for memory hotplug · 79b78a6b

由 Michael Roth 提交于 10月 26, 2016

Commit 0a417869:

    spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type

dropped per-DRC/per-LMB hotplugs event in favor of a bulk add via a
single LMB count value. This was to avoid overrunning the guest EPOW
event queue with hotplug events. This works fine, but relies on the
guest exhaustively scanning for pluggable LMBs to satisfy the
requested count by issuing rtas-get-sensor(DR_ENTITY_SENSE, ...) calls
until all the LMBs associated with the DIMM are identified.

With newer support for dedicated hotplug event source, this queue
exhaustion is no longer as much of an issue due to implementation
details on the guest side, but we still try to avoid excessive hotplug
events by now supporting both a count and a starting index to avoid
unecessary work. This patch makes use of that approach when the
capability is available.

Cc: bharata@linux.vnet.ibm.com
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

79b78a6b

spapr: add hotplug interrupt machine options · f6229214

由 Michael Roth 提交于 10月 26, 2016

This adds machine options of the form:

  -machine pseries,modern-hotplug-events=true
  -machine pseries,modern-hotplug-events=false

If false, QEMU will force the use of "legacy" style hotplug events,
which are surfaced through EPOW events instead of a dedicated
hot plug event source, and lack certain features necessary, mainly,
for memory unplug support.

If true, QEMU will enable support for "modern" dedicated hot plug
event source. Note that we will still default to "legacy" style unless
the guest advertises support for the "modern" hotplug events via
ibm,client-architecture-support hcall during early boot.

For pseries-2.7 and earlier we default to false, for newer machine
types we default to true.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

f6229214

spapr_events: add support for dedicated hotplug event source · ffbb1705

由 Michael Roth 提交于 10月 26, 2016

Hotplug events were previously delivered using an EPOW interrupt
and were queued by linux guests into a circular buffer. For traditional
EPOW events like shutdown/resets, this isn't an issue, but for hotplug
events there are cases where this buffer can be exhausted, resulting
in the loss of hotplug events, resets, etc.

Newer-style hotplug event are delivered using a dedicated event source.
We enable this in supported guests by adding standard an additional
event source in the guest device-tree via /event-sources, and, if
the guest advertises support for the newer-style hotplug events,
using the corresponding interrupt to signal the available of
hotplug/unplug events.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

ffbb1705

spapr: improve ibm,architecture-vec-5 property handling · 417ece33

由 Michael Roth 提交于 10月 24, 2016

ibm,architecture-vec-5 is supposed to encode all option vector 5 bits
negotiated between platform/guest. Currently we hardcode this property
in the boot-time device tree to advertise a single negotiated
capability, "Form 1" NUMA Affinity, regardless of whether or not CAS
has been invoked or that capability has actually been negotiated.

Improve this by generating ibm,architecture-vec-5 based on the full
set of option vector 5 capabilities negotiated via CAS.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

417ece33

spapr: add option vector handling in CAS-generated resets · 6787d27b

由 Michael Roth 提交于 10月 24, 2016

In some cases, ibm,client-architecture-support calls can fail. This
could happen in the current code for situations where the modified
device tree segment exceeds the buffer size provided by the guest
via the call parameters. In these cases, QEMU will reset, allowing
an opportunity to regenerate the device tree from scratch via
boot-time handling. There are potentially other scenarios as well,
not currently reachable in the current code, but possible in theory,
such as cases where device-tree properties or nodes need to be removed.

We currently don't handle either of these properly for option vector
capabilities however. Instead of carrying the negotiated capability
beyond the reset and creating the boot-time device tree accordingly,
we start from scratch, generating the same boot-time device tree as we
did prior to the CAS-generated and the same device tree updates as we
did before. This could (in theory) cause us to get stuck in a reset
loop. This hasn't been observed, but depending on the extensiveness
of CAS-induced device tree updates in the future, could eventually
become an issue.

Address this by pulling capability-related device tree
updates resulting from CAS calls into a common routine,
spapr_dt_cas_updates(), and adding an sPAPROptionVector*
parameter that allows us to test for newly-negotiated capabilities.
We invoke it as follows:

1) When ibm,client-architecture-support gets called, we
call spapr_dt_cas_updates() with the set of capabilities
added since the previous call to ibm,client-architecture-support.
For the initial boot, or a system reset generated by something
other than the CAS call itself, this set will consist of *all*
options supported both the platform and the guest. For calls
to ibm,client-architecture-support immediately after a CAS-induced
reset, we call spapr_dt_cas_updates() with only the set
of capabilities added since the previous call, since the other
capabilities will have already been addressed by the boot-time
device-tree this time around. In the unlikely event that
capabilities are *removed* since the previous CAS, we will
generate a CAS-induced reset. In the unlikely event that we
cannot fit the device-tree updates into the buffer provided
by the guest, well generate a CAS-induced reset.

2) When a CAS update results in the need to reset the machine and
include the updates in the boot-time device tree, we call the
spapr_dt_cas_updates() using the full set of negotiated
capabilities as part of the reset path. At initial boot, or after
a reset generated by something other than the CAS call itself,
this set will be empty, resulting in what should be the same
boot-time device-tree as we generated prior to this patch. For
CAS-induced reset, this routine will be called with the full set of
capabilities negotiated by the platform/guest in the previous
CAS call, which should result in CAS updates from previous call
being accounted for in the initial boot-time device tree.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
[dwg: Changed an int -> bool conversion to be more explicit]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

6787d27b

spapr_hcall: use spapr_ovec_* interfaces for CAS options · facdb8b6

由 Michael Roth 提交于 10月 24, 2016

Currently we access individual bytes of an option vector via
ldub_phys() to test for the presence of a particular capability
within that byte. Currently this is only done for the "dynamic
reconfiguration memory" capability bit. If that bit is present,
we pass a boolean value to spapr_h_cas_compose_response()
to generate a modified device tree segment with the additional
properties required to enable this functionality.

As more capability bits are added, will would need to modify the
code to add additional option vector accesses and extend the
param list for spapr_h_cas_compose_response() to include similar
boolean values for these parameters.

Avoid this by switching to spapr_ovec_* helpers so we can do all
the parsing in one shot and then test for these additional bits
within spapr_h_cas_compose_response() directly.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

facdb8b6