Commits · 6e03cc5cf0dac9ec40dce7e3500b442761bc8e96 · openeuler / qemu

17 Jan, 2018 · 15 commits

ppc/pnv: change initrd address · fef592f9

Authored by Cédric Le Goater on Jan 15, 2018

When skiboot starts, it first clears the CPU structs for all possible
CPUs on a system:

	for (i = 0; i <= cpu_max_pir; i++)
		memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread));

On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks
array overlaps with the memory region in which QEMU maps the initramfs
file. Move it upwards in memory to keep it safe.
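
The overlap described above is plain interval arithmetic. The following is a minimal sketch, not skiboot or QEMU code: the helper names, the per-CPU stack size and the base addresses used in any call are assumptions for illustration, and only the cpu_max_pir value of 0x7fff comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical address range [start, start + size). */
struct mem_range {
    uint64_t start;
    uint64_t size;
};

/* Two non-empty ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(struct mem_range a, struct mem_range b)
{
    assert(a.size > 0 && b.size > 0);
    return a.start < b.start + b.size && b.start < a.start + a.size;
}

/* Extent of a per-CPU array indexed from 0 to cpu_max_pir inclusive. */
static struct mem_range cpu_stacks_range(uint64_t base, uint64_t per_cpu_size,
                                         uint32_t cpu_max_pir)
{
    struct mem_range r = { base, ((uint64_t)cpu_max_pir + 1) * per_cpu_size };
    return r;
}
```

With a hypothetical 16 KiB per CPU, 0x8000 entries already span 512 MiB, so anything loaded in that window has to move above it.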
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/pnv: fix XSCOM core addressing on POWER9 · c035851a

Authored by Cédric Le Goater on Jan 15, 2018

The XSCOM base address of the core chiplet was wrongly calculated. Use
the OPAL macros to fix that and do a couple of renames.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/pnv: introduce pnv*_is_power9() helpers · b3b066e9

Authored by Cédric Le Goater on Jan 15, 2018

These are useful when instantiating device models which are shared
between the POWER8 and the POWER9 processor families.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/pnv: change core mask for POWER9 · 09279d7e

Authored by Cédric Le Goater on Jan 15, 2018

When addressed by XSCOM, the first core has the 0x20 chiplet ID but
the CPU PIR can start at 0x0.
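
The relation between the two numbering spaces can be sketched as follows. This is an illustration only, not the QEMU or OPAL macros: the function names, the bound on the core index and the shift that places the chiplet ID in an XSCOM address are assumptions. Only the 0x20 chiplet ID of the first core comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define P9_FIRST_CORE_CHIPLET_ID 0x20u  /* from the commit message */
#define CHIPLET_ID_SHIFT 24             /* assumption for this sketch */

/* Chiplet ID used on XSCOM for a core whose index starts at 0. */
static uint32_t p9_core_chiplet_id(uint32_t core_index)
{
    assert(core_index < 0x20);  /* hypothetical bound for this sketch */
    return P9_FIRST_CORE_CHIPLET_ID + core_index;
}

/* Hypothetical XSCOM base address derived from the chiplet ID. */
static uint64_t p9_core_xscom_base(uint32_t core_index)
{
    return (uint64_t)p9_core_chiplet_id(core_index) << CHIPLET_ID_SHIFT;
}
```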
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/pnv: use POWER9 DD2 processor · 83028a2b

Authored by Cédric Le Goater on Jan 15, 2018

Commit 1ed9c8af ("target/ppc: Add POWER9 DD2.0 model information")
deprecated the POWER9 model v1.0.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Adjust default VSMT value for better migration compatibility · 8904e5a7

Authored by David Gibson on Jan 15, 2018

Commit fa98fbfc "PPC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core.  This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.

The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest.  This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.

Unfortunately, even that reduced leakage of host configuration into
guest-visible configuration still breaks things.  Specifically, a guest
with 4 (or fewer) vthreads/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and a POWER9 (vsmt==4) host.  That means the
vcpu ids don't line up, so you can't migrate between them, though you should
be able to.
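
To see why the ids stop lining up, here is a minimal sketch of the spacing rule described above (thread 0 of each virtual core sits at a multiple of vsmt). The function name and signature are hypothetical, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* vcpu id of a guest thread when thread 0 of each core is spaced by vsmt. */
static uint32_t vcpu_id(uint32_t core, uint32_t thread,
                        uint32_t threads_per_core, uint32_t vsmt)
{
    assert(thread < threads_per_core && threads_per_core <= vsmt);
    return core * vsmt + thread;
}
```

For a guest with 4 vthreads/vcore, thread 0 of core 1 gets id 8 with vsmt==8 but id 4 with vsmt==4, so the same guest topology yields different vcpu ids on the two hosts.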

Long term we really want to make vsmt == smp_threads for sufficiently
new machine types.  However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.

In the meantime we need some default that will work as often as
possible.  This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.

Following is case by case justification for why this is still the least
worst option.  Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.

KVM HV on POWER8 host:
   This is the overwhelmingly common case in production setups, and is
   unchanged by design.  POWER8 hosts will advertise a default VSMT mode
   of 8, and > 8 vthreads/vcore isn't permitted.

KVM HV on POWER7 host:
   Will break, but POWER7s allowing KVM were never released to the public.

KVM HV on POWER9 host:
   Not yet released to the public, breaking this now will reduce other
   breakage later.

KVM HV on PowerPC 970:
   Will theoretically break it, but it was barely supported to begin with
   and already required various user visible hacks to work.  Also so old
   that I just don't care.

TCG:
   This is the nastiest one; it means migration of TCG guests (without
   manual vsmt setting) will break.  Since TCG is rarely used in production
   I think this is worth it for the other benefits.  It does also remove
   one more barrier to TCG<->KVM migration which could be interesting for
   debugging applications.

KVM PR:
   As with TCG, this will break migration of existing configurations,
   without adding extra manual vsmt options.  As with TCG, it is rare in
   production so I think the benefits outweigh breakages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: Allow some cases where we can't set VSMT mode in the kernel · 1f20f2e0

Authored by David Gibson on Jan 16, 2018

At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.

But in fact we can cope with the kernel having a different vsmt as long as
  a) it's >= the actual number of vthreads/vcore (so that guest threads
     that are supposed to be on the same core act like it)
  b) it's a submultiple of the requested vsmt mode (so that guest threads
     spaced by the vsmt value will act like they're on different cores)

Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.
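
Conditions (a) and (b) can be written as a single predicate. This is a sketch under the assumption that all three values are plain thread counts; the function and parameter names are made up for illustration and are not the ones used in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the kernel's fixed vsmt can stand in for the requested one. */
static bool kernel_vsmt_usable(uint32_t kernel_vsmt, uint32_t requested_vsmt,
                               uint32_t threads_per_core)
{
    assert(kernel_vsmt > 0 && requested_vsmt > 0);
    return kernel_vsmt >= threads_per_core &&   /* condition (a) */
           requested_vsmt % kernel_vsmt == 0;   /* condition (b) */
}
```

For example, a kernel stuck at vsmt 4 is acceptable for a requested vsmt of 8 with 4 vthreads/vcore, but not with 8 vthreads/vcore.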
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>

target/ppc: Clarify compat mode max_threads value · abbc1247

Authored by David Gibson on Jan 15, 2018

We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatibility modes table.  It's all about guest expectations, not host
expectations or support (that's handled elsewhere).

In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>

spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfo · 895d5cd6

Authored by David Gibson on Jan 15, 2018

The options field here is intended to list the available values for the
capability.  It's not used yet, because the existing capabilities are
boolean.

We're going to add capabilities that aren't, but in that case the info on
the possible values can be folded into the .description field.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representation · 4e5fe368

Authored by Suraj Jitindar Singh on Jan 12, 2018

Currently spapr_caps are tied to boolean values (on or off). This patch
reworks the caps so that they can have any uint8 value. This allows more
capabilities with various values to be represented in the same way
internally. Capabilities are numbered in ascending order. The internal
representation of capability values is an array of uint8s in the
sPAPRMachineState, indexed by capability number.

Capabilities can have their own name, description, options, getter and
setter functions, type and allow functions. They also each have their own
section in the migration stream. Capabilities are only migrated if they
were explicitly set on the command line, with the assumption that
otherwise the default will match.

On migration we ensure that the capability value on the destination
is greater than or equal to the capability value from the source. So
long as this remains the case, the migration is considered
compatible and allowed to continue.
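
The representation and the migration rule can be sketched as below. This is not the QEMU data structure: the struct, the cmdline_set flags and the function name are assumptions chosen to mirror the description (an array of uint8 values indexed by capability number, with only explicitly set capabilities migrated).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { CAP_HTM, CAP_VSX, CAP_DFP, CAP_NUM };  /* numbered in ascending order */

struct caps {
    uint8_t value[CAP_NUM];     /* indexed by capability number */
    bool cmdline_set[CAP_NUM];  /* hypothetical: set explicitly by the user */
};

/* Compatible when every migrated source value is <= the destination's. */
static bool caps_migration_ok(const struct caps *src, const struct caps *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < CAP_NUM; i++) {
        if (src->cmdline_set[i] && dst->value[i] < src->value[i]) {
            return false;
        }
    }
    return true;
}
```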

This patch implements generic getter and setter functions for boolean
capabilities. It also converts the existing cap-htm, cap-vsx and
cap-dfp capabilities to this new format.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Handle Decimal Floating Point (DFP) as an optional capability · 2d1fb9bc

Authored by David Gibson on Dec 11, 2017

Decimal Floating Point has been available on POWER7 and later (server)
cpus.  However, it can be disabled on the hypervisor, meaning that it's
not available to guests.

We currently handle this by conditionally advertising DFP support in the
device tree depending on whether the guest CPU model supports it - which
can also depend on what's allowed in the host for -cpu host.  That can lead
to confusion on migration, since host properties are silently affecting
guest visible properties.

This patch handles it by treating it as an optional capability for the
pseries machine type.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: Handle VMX/VSX presence as an spapr capability flag · 29386642

Authored by David Gibson on Dec 07, 2017

We currently have some conditionals in the spapr device tree code to decide
whether or not to advertise the availability of the VMX (aka Altivec) and
VSX vector extensions to the guest, based on whether the guest cpu has
those features.

This can lead to confusion and subtle failures on migration, since it makes
a guest visible change based only on host capabilities.  We now have a
better mechanism for this, in spapr capabilities flags, which explicitly
depend on user options rather than host capabilities.

Rework the advertisement of VSX and VMX based on a new VSX capability.  We
no longer bother with a conditional for VMX support, because every CPU
that's ever been supported by the pseries machine type supports VMX.

NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on
availability of VSX in libc, so using cap-vsx=off may lead to a fatal
SIGILL in init.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: Validate capabilities on migration · be85537d

Authored by David Gibson on Dec 11, 2017

Now that the "pseries" machine type implements optional capabilities (well,
one so far) there's the possibility of having different capabilities
available at either end of a migration. Although arguably a user error,
it would be nice to catch this situation and fail as gracefully as we can.

This adds code to migrate the capabilities flags. These aren't pulled
directly into the destination's configuration since what the user has
specified on the destination command line should take precedence. However,
they are checked against the destination capabilities.

If the source was using a capability which is absent on the destination,
we fail the migration, since that could easily cause a guest crash or other
bad behaviour. If the source lacked a capability which is present on the
destination we warn, but allow the migration to proceed.
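
The fail/warn/allow decision can be sketched with capability flags held in a bitmask. The bit assignments, enum and function name here are hypothetical, chosen only to illustrate the rule stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch. */
#define CAP_HTM (UINT64_C(1) << 0)
#define CAP_VSX (UINT64_C(1) << 1)
#define CAP_DFP (UINT64_C(1) << 2)
#define CAPS_KNOWN (CAP_HTM | CAP_VSX | CAP_DFP)

enum caps_check { CAPS_OK, CAPS_WARN, CAPS_FAIL };

static enum caps_check check_migrated_caps(uint64_t src_caps, uint64_t dst_caps)
{
    assert(((src_caps | dst_caps) & ~CAPS_KNOWN) == 0);
    if (src_caps & ~dst_caps) {
        return CAPS_FAIL;  /* source used something the destination lacks */
    }
    if (dst_caps & ~src_caps) {
        return CAPS_WARN;  /* destination offers more than the source had */
    }
    return CAPS_OK;
}
```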
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: Treat Hardware Transactional Memory (HTM) as an optional capability · ee76a09f

Authored by David Gibson on Dec 11, 2017

This adds an spapr capability bit for Hardware Transactional Memory.  It is
enabled by default for pseries-2.11 and earlier machine types with POWER8
or later CPUs (as it must be, since earlier qemu versions would implicitly
allow it).  However, it is disabled by default for the latest pseries-2.12
machine type.

This means that with the latest machine type, HTM will not be available,
regardless of CPU, unless it is explicitly enabled on the command line.
That change is made on the basis that:

 * This way running with -M pseries,accel=tcg will start with whatever cpu
   and will provide the same guest visible model as with accel=kvm.
     - More specifically, this means existing make check tests don't have
       to be modified to use cap-htm=off in order to run with TCG

 * We hope to add a new "HTM without suspend" feature in the not too
   distant future which could work on both POWER8 and POWER9 cpus, and
   could be enabled by default.

 * Best guesses suggest that future POWER cpus may well only support the
   HTM-without-suspend model, not the (frankly, horribly overcomplicated)
   POWER8 style HTM with suspend.

 * Anecdotal evidence suggests problems with HTM being enabled when it
   wasn't wanted are more common than being missing when it was.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>

spapr: Capabilities infrastructure · 33face6b

Authored by David Gibson on Dec 08, 2017

Because PAPR is a paravirtual environment, access to certain CPU (or other)
facilities can be blocked by the hypervisor. PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.

In some places we automatically determine whether to make a feature
available based on whether our host can support it; in most cases this is
based on limitations in the available KVM implementation.

Although we correctly advertise this to the guest, it means that host
factors might make changes to the guest visible environment, which is bad:
as well as generally reducing reproducibility, it means that a migration
between different host environments can easily go bad.

We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability. But it's still not right, and some limitations on
POWER9 are going to make it more of an issue in future.

This introduces an infrastructure for defining "sPAPR capabilities". These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overridden with machine properties.

The intention is that at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations). If not we simply fail, rather than silently modifying the
advertised featureset to the guest.
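
The two steps (compute the requested set, then verify it against the host) can be sketched as follows. The struct, its field names and both functions are hypothetical and use a simple bitmask; they illustrate the policy described above, not the QEMU implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cap_request {
    uint64_t machine_defaults;  /* defaults for the machine version */
    uint64_t cpu_mask;          /* what the chosen cpu model can provide */
    uint64_t forced_on;         /* user asked for cap-x=on */
    uint64_t forced_off;        /* user asked for cap-x=off */
};

/* Defaults masked by the cpu, then overridden by machine properties. */
static uint64_t effective_caps(const struct cap_request *r)
{
    assert((r->forced_on & r->forced_off) == 0);
    uint64_t caps = r->machine_defaults & r->cpu_mask;
    caps |= r->forced_on;
    caps &= ~r->forced_off;
    return caps;
}

/* Reset-time check: fail rather than silently drop an unsupported cap. */
static bool caps_supported(uint64_t requested, uint64_t host_supported)
{
    return (requested & ~host_supported) == 0;
}
```

The point of the design is that the caller refuses to start when caps_supported() is false, instead of quietly advertising a smaller feature set to the guest.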

This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>

10 Jan, 2018 · 5 commits

spapr: Correct compatibility mode setting for hotplugged CPUs · 51f84465

Authored by David Gibson on Jan 04, 2018

Currently the pseries machine sets the compatibility mode for the
guest's cpus in two places: 1) at machine reset and 2) after CAS
negotiation.

This means that if we set or negotiate a compatibility mode, then
hotplug a cpu, the hotplugged cpu doesn't get the right mode set and
will incorrectly have the full native features.

To correct this, we set the compatibility mode on a cpu when it is
brought online with the 'start-cpu' RTAS call.  Given that, we no
longer need to set the compatibility mode on all CPUs at machine
reset, so we change that to only set the mode for the boot cpu.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

hw/ppc: Remove the deprecated spapr-pci-vfio-host-bridge device · a7167668

Authored by Thomas Huth on Jan 03, 2018

It has been a deprecated dummy device since QEMU v2.6.0. That should have
been enough time to allow the users to update their scripts in case
they still use it, so let's remove this legacy code now.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

target/ppc: more use of the PPC_*() macros · a6a444a8

Authored by Cédric Le Goater on Dec 22, 2017

Also introduce utilities to manipulate bitmasks (originally from OPAL)
which will be used in the model of the XIVE interrupt controller.
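
As an illustration of what such bitmask utilities do, here is a sketch written as plain functions. It assumes IBM bit numbering (bit 0 is the most significant bit of a 64-bit word); the function names are made up for this example and are not the macros added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit word. */
static uint64_t ppc_bit(unsigned bit)
{
    assert(bit < 64);
    return UINT64_C(0x8000000000000000) >> bit;
}

/* Mask covering bits bs..be inclusive, with bs <= be in IBM numbering. */
static uint64_t ppc_bitmask(unsigned bs, unsigned be)
{
    assert(bs <= be);
    return (ppc_bit(bs) - ppc_bit(be)) | ppc_bit(bs);
}

/* Number of zero bits below the lowest set bit of a non-zero mask. */
static unsigned mask_to_shift(uint64_t mask)
{
    unsigned shift = 0;
    assert(mask != 0);
    while (!(mask & 1)) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Extract the field selected by mask, right-aligned. */
static uint64_t get_field(uint64_t mask, uint64_t word)
{
    return (word & mask) >> mask_to_shift(mask);
}

/* Return word with the field selected by mask replaced by value. */
static uint64_t set_field(uint64_t mask, uint64_t word, uint64_t value)
{
    return (word & ~mask) | ((value << mask_to_shift(mask)) & mask);
}
```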
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/pnv: change powernv_ prefix to pnv_ for overall naming consistency · b168a138

Authored by Cédric Le Goater on Dec 15, 2017

The 'pnv' prefix is now used throughout, and the routines populating the
device tree start with 'pnv_dt'. The handler of the PnvXScomInterface
is also renamed to 'dt_xscom' which should reflect that it is
populating the device tree under the 'xscom@' node of the chip.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_pci: use warn_report() · 2b3db9dd

Authored by Greg Kurz on Dec 18, 2017

These two are definitely warnings. Let's use the appropriate API.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

18 Dec, 2017 · 4 commits

hw/net/ne2000: extract ne2k-isa code from i386/pc to ne2000-isa.c · 489983d6

Authored by Philippe Mathieu-Daudé on Oct 17, 2017

- add "hw/net/ne2000-isa.h"
- remove the old i386 dependency
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [PPC]
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

hw/timer/mc146818: rename rtc_init() -> mc146818_rtc_init() · 6c646a11

Authored by Philippe Mathieu-Daudé on Oct 17, 2017

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

ppc: remove duplicated includes · 1945e6ab

Authored by Philippe Mathieu-Daudé on Oct 17, 2017

applied using ./scripts/clean-includes

not needed since 7ebaf795
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

hw: use "qemu/osdep.h" as first #include in source files · e9808d09

Authored by Philippe Mathieu-Daudé on Oct 17, 2017

applied using ./scripts/clean-includes
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

15 Dec, 2017 · 16 commits

spapr: don't initialize PATB entry if max-cpu-compat < power9 · 1481fe5f

Authored by Laurent Vivier on Dec 14, 2017

If KVM is enabled and the KVM MMU radix capability is available,
the partition table entry (patb_entry) for the radix mode is
initialized by default in ppc_spapr_reset().

This is a problem if we want to migrate the guest to a POWER8 host
before the guest kernel has started and set the value to the one
expected for a POWER8 CPU.

The "-machine max-cpu-compat=power8" option should allow migrating
a guest from a POWER9 KVM host to a POWER8 KVM host, but because
patb_entry is set, the destination QEMU tries to enable radix mode on
the POWER8 host. This fails and cancels the migration:

    Process table config unsupported by the host
    error while loading state for instance 0x0 of device 'spapr'
    load of migration failed: Invalid argument

This patch doesn't set the PATB entry if the user provides
a CPU compatibility mode that doesn't support radix mode.
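
The resulting decision can be sketched as a small predicate. The enum of compatibility levels and the function name are hypothetical; the only logic taken from the commit message is that the entry is set by default when KVM radix is available, and skipped when the requested compatibility mode is older than POWER9.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering of compat levels; COMPAT_NONE means no limit set. */
enum compat_level {
    COMPAT_NONE = 0,
    COMPAT_POWER7,
    COMPAT_POWER8,
    COMPAT_POWER9
};

static bool should_init_radix_patb(bool kvm_enabled, bool kvm_has_radix,
                                   enum compat_level max_compat)
{
    assert(max_compat <= COMPAT_POWER9);
    if (!kvm_enabled || !kvm_has_radix) {
        return false;
    }
    /* Skip the entry when the user caps the guest below POWER9. */
    return max_compat == COMPAT_NONE || max_compat >= COMPAT_POWER9;
}
```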
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Assume msi_nonbroken · 4f441474

Authored by David Gibson on Dec 08, 2017

We conditionally adjust part of the guest device tree based on the
global msi_nonbroken flag.  However, the main machine type code
initializes msi_nonbroken to true and there's nothing that would set
it to false again.

So replace the test with an assert().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

spapr: Rename machine init functions for clarity · bcb5ce08

Authored by David Gibson on Dec 08, 2017

Machine objects have two init functions - the generic QOM level
instance_init which should only do static object initialization, and
the Machine specific MachineClass::init which does the actual
construction of the machine.

In spapr the functions implementing these two have names -
ppc_machine_initfn() and ppc_spapr_init() - which don't correspond closely
to either of those.  To prevent people (read, me) from confusing the two,
rename them spapr_instance_init() and spapr_machine_init() to
make it clearer which is which.

While we're there rename ppc_spapr_reset() to spapr_machine_reset() to
match.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

spapr_events: drop bogus cell from "interrupt-ranges" property · 638f2caa

Authored by Greg Kurz on Dec 06, 2017

According to LoPAPR 1.1 B.6.12, the "/event-sources" node has an
"interrupt-ranges" property, the format of which is described in B.6.9.1.2
as follows:

“interrupt-ranges”
 Standard property name that defines the interrupt number(s) and range(s)
 handled by this unit.

 prop-encoded-array: List of (int-number, range) specifications.

 Int-number is encoded as with encode-int.
 Range is encoded as with encode-int.

 The first entry in this list shall contain the int-number associated with
 the first “reg” property entry. The int-number is the value representing
 the interrupt source as would appear in the PowerPC External Interrupt
 Architecture XISR. The range shall be the number of sequential interrupt
 numbers which this unit can generate.

There's no such thing as a cell count at the end of the array, like the
one introduced by commit ffbb1705 in QEMU 2.8. It doesn't seem it had
any impact on existing guests and I couldn't find any related workaround
in linux. So, let's just drop the bogus lines.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: fix LSI interrupt specifiers in the device tree · bb2d8ab6

Authored by Greg Kurz on Dec 06, 2017

LoPAPR 1.1 B.6.9.1.2 describes the "#interrupt-cells" property of the
PowerPC External Interrupt Source Controller node as follows:

“#interrupt-cells”

  Standard property name to define the number of cells in an
  interrupt-specifier within an interrupt domain.

  prop-encoded-array: An integer, encoded as with encode-int, that denotes
  the number of cells required to represent an interrupt specifier in its
  child nodes.

  The value of this property for the PowerPC External Interrupt option shall
  be 2. Thus all interrupt specifiers (as used in the standard “interrupts”
  property) shall consist of two cells, each containing an integer encoded
  as with encode-int. The first integer represents the interrupt number the
  second integer is the trigger code: 0 for edge triggered, 1 for level
  triggered.

This patch fixes the interrupt specifiers in the "interrupt-map" property
of the PHB node, which were setting the second cell to 8 (confusion with
IRQ_TYPE_LEVEL_LOW?) instead of 1.

VIO devices and RTAS event sources use the same format for interrupt
specifiers: while here, we introduce a common helper to handle the
encoding details.
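
The two-cell encoding quoted above can be sketched like this. It is an independent illustration and not the spapr_dt_xics_irq() helper itself: the function names and the byte-array representation of device tree cells are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as one big-endian device tree cell. */
static void put_be32_cell(uint8_t cell[4], uint32_t v)
{
    cell[0] = (uint8_t)(v >> 24);
    cell[1] = (uint8_t)(v >> 16);
    cell[2] = (uint8_t)(v >> 8);
    cell[3] = (uint8_t)v;
}

/* Two cells: interrupt number, then trigger code (0 = edge, 1 = level). */
static void encode_irq_spec(uint8_t spec[8], uint32_t irq, bool is_lsi)
{
    assert(spec != NULL);
    put_be32_cell(spec, irq);
    put_be32_cell(spec + 4, is_lsi ? 1 : 0);
}
```

A level-sensitive interrupt (LSI) therefore ends in a cell holding 1, which is the value the fix restores in place of 8.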
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
--
v3: - reference public LoPAPR instead of internal PAPR+ in changelog
    - change helper name to spapr_dt_xics_irq()

v2: - drop the erroneous changes to the "interrupts" prop in PCI device nodes
    - introduce a common helper to encode interrupt specifiers
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: replace numa_get_node() with lookup in pc-dimm list · f47bd1c8

Authored by Igor Mammedov on Dec 05, 2017

SPAPR is the last user of numa_get_node() and a bunch of
supporting code to maintain numa_info[x].addr list.

Get the LMB node id from the pc-dimm list, which allows removing
~80 LOC that maintain the dynamic address range
lookup list.

It also removes the pc-dimm dependency on numa_[un]set_mem_node_id(),
makes the pc-dimms the sole source of information about which
node they belong to, and removes duplicate data from the global
numa_info.
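
The lookup that replaces the separate address list can be sketched as a scan over the DIMMs themselves. The struct and function below are hypothetical stand-ins, not the QEMU pc-dimm types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one plugged DIMM. */
struct dimm {
    uint64_t addr;
    uint64_t size;
    int node;
};

/* Return the node of the DIMM covering addr, or -1 if none does. */
static int node_for_addr(const struct dimm *dimms, size_t count, uint64_t addr)
{
    assert(dimms != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= dimms[i].addr && addr - dimms[i].addr < dimms[i].size) {
            return dimms[i].node;
        }
    }
    return -1;
}
```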
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: introduce a spapr_qirq() helper · 77183755

Authored by Cédric Le Goater on Dec 01, 2017

xics_get_qirq() is only used by the sPAPR machine. Let's move it there
and change its name to reflect its scope. It will be useful for XIVE
support which will use its own set of qirqs.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: introduce a spapr_irq_set_lsi() helper · 9e7dc5fc

Authored by Cédric Le Goater on Dec 01, 2017

It will make synchronisation easier with the XIVE interrupt mode when
available. The 'irq' parameter refers to the global IRQ number space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: move the IRQ allocation routines under the machine · 60c6823b

Authored by Cédric Le Goater on Dec 01, 2017

Also change the prototype to use a sPAPRMachineState and prefix them
with spapr_irq_. It will let us synchronise the IRQ allocation with
the XIVE interrupt mode when available.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: assign of the CPU 'intc' pointer under the core · ed0c37ee

Authored by Cédric Le Goater on Dec 01, 2017

The 'intc' pointer of the CPU references the interrupt presenter in
the XICS interrupt mode. When the XIVE interrupt mode is available and
activated, the machine will need to reassign this pointer to reflect
the change.

Moving this assignment under the realize routine of the CPU will ease
the process when the interrupt mode is toggled.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/xics: introduce an icp_create() helper · 4f7a47be

Authored by Cédric Le Goater on Dec 01, 2017

The sPAPR and the PowerNV core objects create the interrupt presenter
object of the CPUs in a very similar way. Let's provide a common
routine in which we use the presenter 'type' as a child identifier.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr/rtas: do not reset the MSR in stop-self command · 3fe4f0fc

Authored by Cédric Le Goater on Nov 24, 2017

When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.

The CPU is now also protected from the decrementer interrupt by the
LPCR:PECE* bits which are disabled in the 'stop-self' RTAS
call. Resetting the MSR is pointless.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr/rtas: fix reboot of a SMP TCG guest · d6322252

Authored by Cédric Le Goater on Nov 24, 2017

Just like for hot-unplugged CPUs, when a guest is rebooted, the secondary
CPUs can be awakened by the decrementer and start entering SLOF at the
same time as the boot CPU.

To be safe, let's disable on the secondaries all the exceptions which
can cause an exit while the CPU is in power-saving mode.

Based on previous work from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr/rtas: disable the decrementer interrupt when a CPU is unplugged · 9a94ee5b

Authored by Cédric Le Goater on Nov 24, 2017

When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.

If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.

I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable in the LPCR all the exceptions which
can cause an exit while the CPU is in power-saving mode and reenable
them when the CPU is started.
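
The mechanism can be modelled in a few lines. This is a toy model, not QEMU code: the struct, the function names and the PECE bit values are placeholders that do not reflect the architected LPCR layout, and a running CPU is simply treated as always having work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for this sketch, not the architected LPCR layout. */
#define PECE_EXTERNAL    (UINT64_C(1) << 0)
#define PECE_DECREMENTER (UINT64_C(1) << 1)
#define PECE_ALL         (PECE_EXTERNAL | PECE_DECREMENTER)

struct vcpu {
    bool halted;
    uint64_t lpcr;     /* wake-enable bits */
    uint64_t pending;  /* pending interrupts, same bit positions */
};

/* A halted CPU only has work if a pending interrupt is wake-enabled. */
static bool vcpu_has_work(const struct vcpu *cpu)
{
    assert(cpu != NULL);
    if (!cpu->halted) {
        return true;  /* simplification: running CPUs always have work */
    }
    return (cpu->pending & cpu->lpcr & PECE_ALL) != 0;
}

/* 'stop-self': halt and mask every wake-up source. */
static void stop_self(struct vcpu *cpu)
{
    cpu->halted = true;
    cpu->lpcr &= ~PECE_ALL;
}

/* 'start-cpu': re-enable the wake-up sources and resume. */
static void start_cpu(struct vcpu *cpu)
{
    cpu->lpcr |= PECE_ALL;
    cpu->halted = false;
}
```

In this model a decrementer that fires after stop_self() no longer gives the stopped CPU any work, which is the behaviour the patch aims for.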
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

e500: name openpic and pci host bridge · e75ce32a

Authored by Michael Davidsaver on Nov 19, 2017

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_cpu_core: instantiate CPUs separately · 94ad93bd

Authored by Greg Kurz on Nov 20, 2017

The current code assumes that only the CPU core object holds a
reference on each individual CPU object, and happily frees their
allocated memory when the core is unrealized. This is dangerous
as some other code can legitimately keep a pointer to a CPU if it
calls object_ref(), but it would end up with a dangling pointer.

Let's allocate all CPUs with object_new() and let QOM free them
when their reference count reaches zero. This greatly simplifies the
code as we don't have to fiddle with the instance size anymore.
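
The ownership rule being adopted is ordinary reference counting. The following is a generic sketch of that idea, not QOM: the struct and the object_create/object_get/object_put names are invented for this example and deliberately differ from QEMU's object_new() and object_ref().

```c
#include <assert.h>
#include <stdlib.h>

struct object {
    unsigned refcount;
    void (*finalize)(struct object *obj);  /* optional hook run before free */
};

/* Allocate an object holding one reference for the creator. */
static struct object *object_create(void (*finalize)(struct object *))
{
    struct object *obj = calloc(1, sizeof(*obj));
    if (obj != NULL) {
        obj->refcount = 1;
        obj->finalize = finalize;
    }
    return obj;
}

/* Take an extra reference, e.g. for code that keeps a pointer around. */
static void object_get(struct object *obj)
{
    assert(obj->refcount > 0);
    obj->refcount++;
}

/* Drop a reference; the last holder to drop it frees the object. */
static void object_put(struct object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        if (obj->finalize != NULL) {
            obj->finalize(obj);
        }
        free(obj);
    }
}
```

With this scheme the core dropping its reference at unrealize time does not free a CPU that another holder still references.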
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>