提交 · eabff1582032b662fa58e6e880ca7399f1b78369 · openeuler / qemu

15 11月, 2016 6 次提交

ppc/pnv: Fix fatal bug on 32-bit hosts · 27d9ffd4

由 David Gibson 提交于 11月 14, 2016

If the pnv machine type is compiled on a 32-bit host, the unsigned long
(host) type is 32-bit.  This means that the hweight_long() used to
calculate the number of allowed cores only considers the low 32 bits of
the cores_mask variable, and can thus return 0 in some circumstances.

This corrects the bug.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Suggested-by: NRichard Henderson <rth@twiddle.net>
[clg: replaced hweight_long() by ctpop64() ]
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

27d9ffd4

ppc/pnv: fix xscom address translation for POWER9 · f81e5512

由 Cédric Le Goater 提交于 11月 14, 2016

High addresses can overflow the uint32_t pcba variable after the 8byte
shift.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

f81e5512

ppc/pnv: add a 'xscom_core_base' field to PnvChipClass · ad521238

由 Cédric Le Goater 提交于 11月 14, 2016

The XSCOM addresses for the core registers are encoded in a slightly
different way on POWER8 and POWER9.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

ad521238

spapr: Fix migration of PCI host bridges from qemu-2.7 · 9b54ca0b

由 David Gibson 提交于 11月 15, 2016

daa23699 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from
qemu-2.7 to the current version. It split the device's MMIO window into
two pieces for 32-bit and 64-bit MMIO.

The patch included backwards compatibility code to convert the old property
into the new format. However, the property value was also transferred in
the migration stream and compared with a (probably unwise) VMSTATE_EQUAL.
So, the "raw" value from 2.7 is compared to the new style converted value
from (pre-)2.8 giving a mismatch and migration failure.

Although it would be technically possible to fix this in a way allowing
backwards migration, that would leave an ugly legacy around indefinitely.
This patch takes the simpler approach of bumping the migration version,
dropping the unwise VMSTATE_EQUAL (and some equally unwise ones around it)
and ignoring them on an incoming migration.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

9b54ca0b

ppc/pnv: fix compile breakage on old gcc · ec575aa0

由 Cédric Le Goater 提交于 11月 07, 2016

PnvChip is defined twice and this can confuse old compilers :

  CC      ppc64-softmmu/hw/ppc/pnv_xscom.o
In file included from qemu.git/hw/ppc/pnv.c:29:
qemu.git/include/hw/ppc/pnv.h:60: error: redefinition of typedef ‘PnvChip’
qemu.git/include/hw/ppc/pnv_xscom.h:24: note: previous declaration of ‘PnvChip’ was here
make[1]: *** [hw/ppc/pnv.o] Error 1
make[1]: *** Waiting for unfinished jobs....
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

ec575aa0

powernv: CPU compatibility modes don't make sense for powernv · 8bd9530e

由 David Gibson 提交于 11月 01, 2016

powernv has some code (derived from the spapr equivalent) used in device
tree generation which depends on the CPU's compatibility mode / logical
PVR. However, compatibility modes don't make sense on powernv - at least
not as a property controlled by the host - because the guest in powernv
has full hypervisor level access to the virtual system, and so owns the
PCR (Processor Compatibility Register) which implements compatiblity modes.

Note: the new logic doesn't take into account kvmppc_smt_threads() like the
old version did. However, if core->nr_threads exceeds kvmppc_smt_threads()
then things will already be broken and clamping the value in the device
tree isn't going to save us.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NGreg Kurz <groug@kaod.org>
Reviewed-by: NThomas Huth <thuth@redhat.com>

8bd9530e

31 10月, 2016 1 次提交

*_run_on_cpu: introduce run_on_cpu_data type · 14e6fe12

由 Paolo Bonzini 提交于 10月 31, 2016

This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

14e6fe12

28 10月, 2016 32 次提交

clean-up: removed duplicate #includes · 814bb12a

由 Anand J 提交于 10月 21, 2016

Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.
Reviewed-by: NThomas Huth <thuth@redhat.com>
Signed-off-by: NAnand J <anand.indukala@gmail.com>
Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>

814bb12a

spapr: Memory hot-unplug support · cf632463

由 Bharata B Rao 提交于 10月 26, 2016

Add support to hot remove pc-dimm memory devices.

Since we're introducing a machine-level unplug_request hook, we also
had handling for CPU unplug there as well to ensure CPU unplug
continues to work as it did before.
Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
* add hooks to CAS/cmdline enablement of hotplug ACR support
* add hook for CPU unplug
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

cf632463

spapr: use count+index for memory hotplug · 79b78a6b

由 Michael Roth 提交于 10月 26, 2016

Commit 0a417869:

    spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type

dropped per-DRC/per-LMB hotplugs event in favor of a bulk add via a
single LMB count value. This was to avoid overrunning the guest EPOW
event queue with hotplug events. This works fine, but relies on the
guest exhaustively scanning for pluggable LMBs to satisfy the
requested count by issuing rtas-get-sensor(DR_ENTITY_SENSE, ...) calls
until all the LMBs associated with the DIMM are identified.

With newer support for dedicated hotplug event source, this queue
exhaustion is no longer as much of an issue due to implementation
details on the guest side, but we still try to avoid excessive hotplug
events by now supporting both a count and a starting index to avoid
unecessary work. This patch makes use of that approach when the
capability is available.

Cc: bharata@linux.vnet.ibm.com
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

79b78a6b

spapr: Add DRC count indexed hotplug identifier type · afdbd403

由 Bharata B Rao 提交于 10月 26, 2016

Add support for DRC count indexed hotplug ID type which is primarily
needed for memory hot unplug. This type allows for specifying the
number of DRs that should be plugged/unplugged starting from a given
DRC index.
Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
* updated rtas_event_log_v6_hp to reflect count/index field ordering
  used in PAPR hotplug ACR
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

afdbd403

spapr: add hotplug interrupt machine options · f6229214

由 Michael Roth 提交于 10月 26, 2016

This adds machine options of the form:

  -machine pseries,modern-hotplug-events=true
  -machine pseries,modern-hotplug-events=false

If false, QEMU will force the use of "legacy" style hotplug events,
which are surfaced through EPOW events instead of a dedicated
hot plug event source, and lack certain features necessary, mainly,
for memory unplug support.

If true, QEMU will enable support for "modern" dedicated hot plug
event source. Note that we will still default to "legacy" style unless
the guest advertises support for the "modern" hotplug events via
ibm,client-architecture-support hcall during early boot.

For pseries-2.7 and earlier we default to false, for newer machine
types we default to true.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

f6229214

spapr_events: add support for dedicated hotplug event source · ffbb1705

由 Michael Roth 提交于 10月 26, 2016

Hotplug events were previously delivered using an EPOW interrupt
and were queued by linux guests into a circular buffer. For traditional
EPOW events like shutdown/resets, this isn't an issue, but for hotplug
events there are cases where this buffer can be exhausted, resulting
in the loss of hotplug events, resets, etc.

Newer-style hotplug event are delivered using a dedicated event source.
We enable this in supported guests by adding standard an additional
event source in the guest device-tree via /event-sources, and, if
the guest advertises support for the newer-style hotplug events,
using the corresponding interrupt to signal the available of
hotplug/unplug events.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

ffbb1705

spapr: improve ibm,architecture-vec-5 property handling · 417ece33

由 Michael Roth 提交于 10月 24, 2016

ibm,architecture-vec-5 is supposed to encode all option vector 5 bits
negotiated between platform/guest. Currently we hardcode this property
in the boot-time device tree to advertise a single negotiated
capability, "Form 1" NUMA Affinity, regardless of whether or not CAS
has been invoked or that capability has actually been negotiated.

Improve this by generating ibm,architecture-vec-5 based on the full
set of option vector 5 capabilities negotiated via CAS.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

417ece33

spapr: add option vector handling in CAS-generated resets · 6787d27b

由 Michael Roth 提交于 10月 24, 2016

In some cases, ibm,client-architecture-support calls can fail. This
could happen in the current code for situations where the modified
device tree segment exceeds the buffer size provided by the guest
via the call parameters. In these cases, QEMU will reset, allowing
an opportunity to regenerate the device tree from scratch via
boot-time handling. There are potentially other scenarios as well,
not currently reachable in the current code, but possible in theory,
such as cases where device-tree properties or nodes need to be removed.

We currently don't handle either of these properly for option vector
capabilities however. Instead of carrying the negotiated capability
beyond the reset and creating the boot-time device tree accordingly,
we start from scratch, generating the same boot-time device tree as we
did prior to the CAS-generated and the same device tree updates as we
did before. This could (in theory) cause us to get stuck in a reset
loop. This hasn't been observed, but depending on the extensiveness
of CAS-induced device tree updates in the future, could eventually
become an issue.

Address this by pulling capability-related device tree
updates resulting from CAS calls into a common routine,
spapr_dt_cas_updates(), and adding an sPAPROptionVector*
parameter that allows us to test for newly-negotiated capabilities.
We invoke it as follows:

1) When ibm,client-architecture-support gets called, we
call spapr_dt_cas_updates() with the set of capabilities
added since the previous call to ibm,client-architecture-support.
For the initial boot, or a system reset generated by something
other than the CAS call itself, this set will consist of *all*
options supported both the platform and the guest. For calls
to ibm,client-architecture-support immediately after a CAS-induced
reset, we call spapr_dt_cas_updates() with only the set
of capabilities added since the previous call, since the other
capabilities will have already been addressed by the boot-time
device-tree this time around. In the unlikely event that
capabilities are *removed* since the previous CAS, we will
generate a CAS-induced reset. In the unlikely event that we
cannot fit the device-tree updates into the buffer provided
by the guest, well generate a CAS-induced reset.

2) When a CAS update results in the need to reset the machine and
include the updates in the boot-time device tree, we call the
spapr_dt_cas_updates() using the full set of negotiated
capabilities as part of the reset path. At initial boot, or after
a reset generated by something other than the CAS call itself,
this set will be empty, resulting in what should be the same
boot-time device-tree as we generated prior to this patch. For
CAS-induced reset, this routine will be called with the full set of
capabilities negotiated by the platform/guest in the previous
CAS call, which should result in CAS updates from previous call
being accounted for in the initial boot-time device tree.
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
[dwg: Changed an int -> bool conversion to be more explicit]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

6787d27b

spapr_hcall: use spapr_ovec_* interfaces for CAS options · facdb8b6

由 Michael Roth 提交于 10月 24, 2016

Currently we access individual bytes of an option vector via
ldub_phys() to test for the presence of a particular capability
within that byte. Currently this is only done for the "dynamic
reconfiguration memory" capability bit. If that bit is present,
we pass a boolean value to spapr_h_cas_compose_response()
to generate a modified device tree segment with the additional
properties required to enable this functionality.

As more capability bits are added, will would need to modify the
code to add additional option vector accesses and extend the
param list for spapr_h_cas_compose_response() to include similar
boolean values for these parameters.

Avoid this by switching to spapr_ovec_* helpers so we can do all
the parsing in one shot and then test for these additional bits
within spapr_h_cas_compose_response() directly.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

facdb8b6

spapr_ovec: initial implementation of option vector helpers · b20b7b7a

由 Michael Roth 提交于 10月 24, 2016

PAPR guests advertise their capabilities to the platform by passing
an ibm,architecture-vec structure via an
ibm,client-architecture-support hcall as described by LoPAPR v11,
B.6.2.3. during early boot.

Using this information, the platform enables the capabilities it
supports, then encodes a subset of those enabled capabilities (the
5th option vector of the ibm,architecture-vec structure passed to
ibm,client-architecture-support) into the guest device tree via
"/chosen/ibm,architecture-vec-5".

The logical format of these these option vectors is a bit-vector,
where individual bits are addressed/documented based on the byte-wise
offset from the beginning of the bit-vector, followed by the bit-wise
index starting from the byte-wise offset. Thus the bits of each of
these bytes are stored in reverse order. Additionally, the first
byte of each option vector is encodes the length of the option vector,
so byte offsets begin at 1, and bit offset at 0.

This is not very intuitive for the purposes of mapping these bits to
a particular documented capability, so this patch introduces a set
of abstractions that encapsulate the work of parsing/encoding these
options vectors and testing for individual capabilities.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Tweaked double-include protection to not trigger a checkpatch
 false positive]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

b20b7b7a

pseries: Remove spapr_create_fdt_skel() · 398a0bd5

由 David Gibson 提交于 10月 20, 2016

For historical reasons construction of the guest device tree in spapr is
divided between spapr_create_fdt_skel() which is called at init time, and
spapr_build_fdt() which runs at reset time. Over time, more and more
things have needed to be moved to reset time.

Previous cleanups mean the only things left in spapr_create_fdt_skel() are
the properties of the root node itself. Finish consolidating these two
parts of device tree construction, by moving this to the start of
spapr_build_fdt(), and removing spapr_create_fdt_skel() entirely.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

398a0bd5

pseries: Consolidate construction of /vdevice device tree node · bf5a6696

由 David Gibson 提交于 10月 20, 2016

Construction of the /vdevice node (and its children) is divided between
spapr_create_fdt_skel() (at init time), which creates the base node, and
spapr_populate_vdevice() (at reset time) which creates the nodes for each
individual virtual device.

This consolidates both into a single function called from
spapr_build_fdt().
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

bf5a6696

pseries: Move /hypervisor node construction to fdt_build_fdt() · fca5f2dc

由 David Gibson 提交于 10月 20, 2016

Currently the /hypervisor device tree node is constructed in
spapr_create_fdt_skel().  As part of consolidating device tree construction
to reset time, move it to a function called from spapr_build_fdt().
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

fca5f2dc

pseries: Move /event-sources construction to spapr_build_fdt() · ffb1e275

由 David Gibson 提交于 10月 20, 2016

The /event-sources device tree node is built from spapr_create_fdt_skel().
As part of consolidating device tree construction to reset time, this moves
it to spapr_build_fdt().
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

ffb1e275

pseries: Consolidate construction of /rtas device tree node · 3f5dabce

由 David Gibson 提交于 10月 20, 2016

For historical reasons construction of the /rtas node in the device
tree (amongst others) is split into several places.  In particular
it's split between spapr_create_fdt_skel(), spapr_build_fdt() and
spapr_rtas_device_tree_setup().

In fact, as well as adding the actual RTAS tokens to the device tree,
spapr_rtas_device_tree_setup() just adds the ibm,lrdr-capacity
property, which despite going in the /rtas node, doesn't have a lot to
do with RTAS.

This patch consolidates the code constructing /rtas together into a new
spapr_dt_rtas() function.  spapr_rtas_device_tree_setup() is renamed to
spapr_dt_rtas_tokens() and now only adds the token properties.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

3f5dabce

pseries: Consolidate construction of /chosen device tree node · 7c866c6a

由 David Gibson 提交于 10月 24, 2016

For historical reasons, building the /chosen node in the guest device tree
is split across several places and includes both parts which write the DT
sequentially and others which use random access functions.

This patch consolidates construction of the node into one place, using
random access functions throughout.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

7c866c6a

pseries: Move construction of /interrupt-controller fdt node · 9b9a1908

由 David Gibson 提交于 10月 20, 2016

Currently the device tree node for the XICS interrupt controller is in
spapr_create_fdt_skel(). As part of consolidating device tree construction
to reset time, this moves it to a function called from spapr_build_fdt().

In addition we move the actual code into hw/intc/xics_spapr.c with the
rest of the PAPR specific interrupt controller code.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

9b9a1908

pseries: Consolidate RTAS loading · 2cac78c1

由 David Gibson 提交于 10月 20, 2016

At each system reset, the pseries machine needs to load RTAS (the runtime
portion of the guest firmware) into the VM.  This means copying
the actual RTAS code into guest memory, and also updating the device
tree so that the guest OS and boot firmware can locate it.

For historical reasons the copy and update to the device tree were in
different parts of the code.  This cleanup brings them both together in
an spapr_load_rtas() function.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

2cac78c1

pseries: Move adding of fdt reserve map entries · cf6e5223

由 David Gibson 提交于 10月 20, 2016

The flattened device tree passed to pseries guests contains a list of
reserved memory areas.  Currently we construct this list early in
spapr_create_fdt_skel() as we sequentially write the fdt.

This will be inconvenient for upcoming cleanups, so this patch moves
the reserve map changes to the end of fdt construction.  This changes
fdt_add_reservemap_entry() calls - which work when writing the fdt
sequentially to fdt_add_mem_rsv() calls used when altering the fdt in
random access mode.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

cf6e5223

pseries: Make spapr_create_fdt_skel() get information from machine state · a19f7fb0

由 David Gibson 提交于 10月 20, 2016

Currently spapr_create_fdt_skel() takes a bunch of individual parameters
for various things it will put in the device tree.  Some of these can
already be taken directly from sPAPRMachineState.  This patch alters it so
that all of them can be taken from there, which will allow this code to
be moved away from its current caller in future.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

a19f7fb0

pseries: Remove rtas_addr and fdt_addr fields from machinestate · cae172ab

由 David Gibson 提交于 10月 20, 2016

These values are used only within ppc_spapr_reset(), so just change them
to local variables.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>

cae172ab

pseries: Split device tree construction from device tree load · 997b6cfc

由 David Gibson 提交于 10月 25, 2016

spapr_finalize_fdt() both finishes building the device tree for the guest
and loads it into guest memory.  For future cleanups, it's going to be
more convenient to do these two things separately.  The loading portion is
pretty trivial, so we move it inline into the caller, ppc_spapr_reset().

We also rename spapr_finalize_fdt(), because the current name is going to
become inaccurate.
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NAlexey Kardashevskiy <aik@ozlabs.ru>

997b6cfc

ppc/pnv: add a ISA bus · 3495b6b6

由 Cédric Le Goater 提交于 10月 22, 2016

As Qemu only supports a single instance of the ISA bus, we use the LPC
controller of chip 0 to create one and plug in a couple of useful
devices, like an UART and RTC. An IPMI BT device, which is also an ISA
device, can be defined on the command line to connect an external BMC.
That is for later.

The PowerNV machine now has a console. Skiboot should load a kernel
and jump into it but execution will stop quite early because we lack a
model for the native XICS controller for the moment :

    [    0.000000] NR_IRQS:512 nr_irqs:512 16
    [    0.000000] XICS: Cannot find a Presentation Controller !
    [    0.000000] ------------[ cut here ]------------
    [    0.000000] WARNING: at arch/powerpc/platforms/powernv/setup.c:81
    ...
    [    0.000000] NIP [c00000000079d65c] pnv_init_IRQ+0x30/0x44

You can still do a few things under xmon.

Based on previous work from :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
[dwg: Trivial fix for a change in the serial_hds_isa_init() interface]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

3495b6b6

ppc/pnv: add a LPC controller · a3980bf5

由 Benjamin Herrenschmidt 提交于 10月 22, 2016

The LPC (Low Pin Count) interface on a POWER8 is made accessible to
the system through the ADU (XSCOM interface). This interface is part
of set of units connected together via a local OPB (On-Chip Peripheral
Bus) which act as a bridge between the ADU and the off chip LPC
endpoints, like external flash modules.

The most important units of this OPB are :
 - OPB Master: contains the ADU slave logic, a set of internal
   registers and the logic to control the OPB.
 - LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of
   internal registers and the LPC HOST Controller to control the LPC
   interface.

Four address spaces are provided to the ADU :
 - LPC Bus Firmware Memory
 - LPC Bus Memory
 - LPC Bus I/O (ISA bus)
 - and the registers for the OPB Master and the LPC Host Controller

On POWER8, an intermediate hop is necessary to reach the OPB, through
a unit called the ECCB. OPB commands are simply mangled in ECCB write
commands.

On POWER9, the OPB master address space can be accessed via MMIO. The
logic is same but the code will be simpler as the XSCOM and ECCB hops
are not necessary anymore.

This version of the LPC controller model doesn't yet implement support
for the SerIRQ deserializer present in the Naples version of the chip
though some preliminary work is there.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - ported on latest PowerNV patchset
      - changed the XSCOM interface to fit new model
      - QOMified the model
      - moved the ISA hunks in another patch
      - removed printf logging
      - added a couple of UNIMP logging
      - rewrote commit log ]
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

a3980bf5

ppc/pnv: add XSCOM handlers to PnvCore · 24ece072

由 Cédric Le Goater 提交于 10月 22, 2016

Now that we are using real HW ids for the cores in PowerNV chips, we
can route the XSCOM accesses to them. We just need to attach a
specific XSCOM memory region to each core in the appropriate window
for the core number.

To start with, let's install the DTS (Digital Thermal Sensor) handlers
which should return 38°C for each core.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

24ece072

ppc/pnv: add XSCOM infrastructure · 967b7523

由 Cédric Le Goater 提交于 10月 22, 2016

On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves
as a backbone to connect different units of the system. The host
firmware connects to the PIB through a bridge unit, the
Alter-Display-Unit (ADU), which gives him access to all the chiplets
on the PCB network (Pervasive Connect Bus), the PIB acting as the root
of this network.

XSCOM (serial communication) is the interface to the sideband bus
provided by the POWER8 pervasive unit to read and write to chiplets
resources. This is needed by the host firmware, OPAL and to a lesser
extent, Linux. This is among others how the PCI Host bridges get
configured at boot or how the LPC bus is accessed.

To represent the ADU of a real system, we introduce a specific
AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The
translation of an XSCOM address into a PCB register address is
slightly different between the P9 and the P8. This is handled before
the dispatch using a 8byte alignment for all.

To customize the device tree, a QOM InterfaceClass, PnvXScomInterface,
is provided with a populate() handler. The chip populates the device
tree by simply looping on its children. Therefore, each model needing
custom nodes should not forget to declare itself as a child at
instantiation time.

Based on previous work done by :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NCédric Le Goater <clg@kaod.org>
[dwg: Added cpu parameter to xscom_complete()]
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

967b7523

ppc/pnv: add a PnvCore object · d2fd9612

由 Cédric Le Goater 提交于 10月 22, 2016

This is largy inspired by sPAPRCPUCore with some simplification, no
hotplug for instance. A set of PnvCore objects is added to the PnvChip
and the device tree is populated looping on these cores.

Real HW cpu ids are now generated depending on the chip cpu model, the
chip id and a core mask. The id is propagated to the CPU object, using
properties, to set the SPR_PIR (Processor Identification Register)
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

d2fd9612

ppc/pnv: add a PIR handler to PnvChip · 631adaff

由 Cédric Le Goater 提交于 10月 22, 2016

The Processor Identification Register (PIR) is a register that holds a
processor identifier which is used for bus transactions (XSCOM) and
for processor differentiation in multiprocessor systems. It also used
in the interrupt vector entries (IVE) to identify the thread serving
the interrupts.

P9 and P8 have some differences in the CPU PIR encoding.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

631adaff

ppc/pnv: add a core mask to PnvChip · 397a79e7

由 Cédric Le Goater 提交于 10月 22, 2016

This will be used to build real HW ids for the cores and enforce some
limits on the available cores per chip.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

397a79e7

ppc/pnv: add a PnvChip object · e997040e

由 Cédric Le Goater 提交于 10月 22, 2016

This is is an abstraction of a POWER8 chip which is a set of cores
plus other 'units', like the pervasive unit, the interrupt controller,
the memory controller, the on-chip microcontroller, etc. The whole can
be seen as a socket. It depends on a cpu model and its characteristics:
max cores and specific inits are defined in a PnvChipClass.

We start with an near empty PnvChip with only a few cpu constants
which we will grow in the subsequent patches with the controllers
required to run the system.

The Chip CFAM (Common FRU Access Module) ID gives the model of the
chip and its version number. It is generally the first thing firmwares
fetch, available at XSCOM PCB address 0xf000f, to start initialization.
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

e997040e

ppc/pnv: add skeleton PowerNV platform · 9e933f4a

由 Benjamin Herrenschmidt 提交于 10月 22, 2016

The goal is to emulate a PowerNV system at the level of the skiboot
firmware, which loads the OS and provides some runtime services. Power
Systems have a lower firmware (HostBoot) that does low level system
initialization, like DRAM training. This is beyond the scope of what
qemu will address in a PowerNV guest.

No devices yet, not even an interrupt controller. Just to get started,
some RAM to load the skiboot firmware, the kernel and initrd. The
device tree is fully created in the machine reset op.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - replaced fprintf by error_report
      - used a common definition of _FDT macro
      - removed VMStateDescription as migration is not yet supported
      - added IBM Copyright statements
      - reworked kernel_filename handling
      - merged PnvSystem and sPowerNVMachineState
      - removed PHANDLE_XICP
      - added ppc_create_page_sizes_prop helper
      - removed nmi support
      - removed kvm support
      - updated powernv machine to version 2.8
      - removed chips and cpus, They will be provided in another patches
      - added a machine reset routine to initialize the device tree (also)
      - french has a squelette and english a skeleton.
      - improved commit log.
      - reworked prototypes parameters
      - added a check on the ram size (thanks to Michael Ellerman)
      - fixed chip-id cell
      - changed MAX_CPUS to 2048
      - simplified memory node creation to one node only
      - removed machine version
      - rewrote the device tree creation with the fdt "rw" routines
      - s/sPowerNVMachineState/PnvMachineState/
      - etc.]
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

9e933f4a

spapr_pci: advertise explicit numa IDs even when there's 1 node · 4bcfa56c

由 Michael Roth 提交于 10月 18, 2016

With the addition of "numa_node" properties for PHBs we began
advertising NUMA affinity in cases where nb_numa_nodes > 1.

Since the default on the guest side is to make no assumptions about
PHB NUMA affinity (defaulting to -1), there is still a valid use-case
for explicitly defining a PHB's NUMA affinity even when there's just
one node. In particular, some workloads make faulty assumptions about
/sys/bus/pci/<devid>/numa_node being >= 0, warranting the use of
this property as a workaround even if there's just 1 PHB or NUMA
node.

Enable this use-case by always advertising the PHB's NUMA affinity
if "numa_node" has been explicitly set.

We could achieve this by relaxing the check to simply be
nb_numa_nodes > 0, but even safer would be to check
numa_info[nodeid].present explicitly, and to fail at start time
for cases where it does not exist.

This has an additional affect of no longer advertising PHB NUMA
affinity unconditionally if nb_numa_nodes > 1 and "numa_node"
property is unset/-1, but since the default value on the guest
side for each PHB is also -1, the behavior should be the same for
that situation. We could still retain the old behavior if desired,
but the decision seems arbitrary, so we take the simpler route.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Shivaprasad G. Bhat <shivapbh@in.ibm.com>
Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

4bcfa56c

25 10月, 2016 1 次提交

Increase MAX_CPUMASK_BITS from 255 to 288 · 079019f2

由 Igor Mammedov 提交于 10月 19, 2016

so that it would be possible to increase maxcpus limit
for x86 target. Keep spapr/virt_arm at limit they used
to have 255.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

079019f2