Commits · a7bf30342e6a7924132a5c70047928261d3c7e42 · openeuler / qemu

24 Sep, 2015 22 commits

hw/intc: Initial implementation of vGICv3 · a7bf3034

Authored by Pavel Fedin on Sep 24, 2015

This is the initial version of KVM-accelerated GICv3 support.
State load and save are not yet supported, live migration is
not possible.

In order to get correct class name in a simpler way, gicv3_class_name()
function is implemented, similar to gic_class_name().

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Message-id: 69d8f01d14994d7a1a140e96aef59fd332d02293.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

a7bf3034

intc/gic: Extract some reusable vGIC code · 4b3cfe72

Authored by Pavel Fedin on Sep 24, 2015

Some functions previously used only by vGICv2 are useful also for vGICv3
implementation. Untie them from GICState and make accessible from within
other modules:
- kvm_arm_gic_set_irq()
- kvm_gic_supports_attr() - moved to common code and renamed to
  kvm_device_check_attr()
- kvm_gic_access() - turned into GIC-independent kvm_device_access().
  Data pointer changed to void * because some GICv3 registers are
  64-bit wide

Some of these changes are not used right now, but they will be helpful for
implementing live migration.

Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but
they would require two extra parameters (s->dev_fd and s->num_cpu) as well as
lots of typecasts of 's' to DeviceState * and back to GICState *. This makes
the code very ugly so I decided to stop at this point. I tried also an
approach with making a base class for all possible GICs, but it would contain
only three variables (dev_fd, cpu_num and irq_num), and accessing them through
the rest of the code would be again tedious (either ugly casts or qemu-style
separate object pointer). So I disliked it too.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2ef56d1dd64ffb75ed02a10dcdaf605e5b8ff4f8.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

4b3cfe72

hw/intc: Implement GIC-500 base class · ff8f06ee

Authored by Shlomo Pongratz on Sep 24, 2015

This class is to be used by both software and KVM implementations of GICv3.

Currently it is mostly a placeholder, but in future it is supposed to hold
qemu's representation of GICv3 state, which is necessary for migration.

The interface of this class is fully compatible with GICv2 one. This is
done in order to simplify integration with existing code.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: aff8baaee493cdcab0694b4a1d4dd5ff27c37ed2.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

ff8f06ee

vfio/pci: Add emulated PCI IDs · 89dcccc5

Authored by Alex Williamson on Sep 23, 2015

Specifying an emulated PCI vendor/device ID can be useful for testing
various quirk paths, even though the behavior and functionality of
the device with bogus IDs is fully unsupportable. We need to use a
uint32_t for the vendor/device IDs, even though the registers
themselves are only 16-bit in order to be able to determine whether
the value is valid and user set.
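
The reasoning about a 32-bit field can be illustrated with a minimal sketch. It assumes an all-ones sentinel means "not set by the user"; the names `ID_UNSET` and `effective_id()` are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an all-ones 32-bit value marks "option not set by the user". */
#define ID_UNSET UINT32_MAX

/*
 * Hypothetical helper: choose the 16-bit ID the guest sees.
 * *valid is cleared when the user supplied a value that cannot fit
 * in a 16-bit PCI ID register.
 */
static uint16_t effective_id(uint32_t user_id, uint16_t hw_id, bool *valid)
{
    assert(valid);
    *valid = true;
    if (user_id == ID_UNSET) {
        return hw_id;            /* nothing requested: keep the hardware ID */
    }
    if (user_id > UINT16_MAX) {
        *valid = false;          /* out of range for a 16-bit register */
        return hw_id;
    }
    return (uint16_t)user_id;    /* user override, 0xffff included */
}
```

With a 16-bit field, a user-supplied 0xffff could not be told apart from "unset", which is the distinction the wider type preserves in this sketch.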

The same support is added for subsystem vendor/device ID, though these
have the possibility of being useful and supported for more than a
testing tool. An emulated platform might want to impose their own
subsystem IDs or at least hide the physical subsystem ID. Windows
guests will often reinstall drivers due to a change in subsystem IDs,
something that VM users may want to avoid. Of course careful
attention would be required to ensure that guest drivers do not rely
on the subsystem ID as a basis for device driver quirks.

All of these options are added using the standard experimental option
prefix and should not be considered stable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

89dcccc5

vfio/pci: Cache vendor and device ID · ff635e37

Authored by Alex Williamson on Sep 23, 2015

Simplify access to commonly referenced PCI vendor and device ID by
caching it on the VFIOPCIDevice struct.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

ff635e37

vfio/pci: Move AMD device specific reset to quirks · c9c50009

Authored by Alex Williamson on Sep 23, 2015

This is just another quirk, for reset rather than affecting memory
regions.  Move it to our new quirks file.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

c9c50009

vfio/pci: Remove old config window and mirror quirks · 958d5534

Authored by Alex Williamson on Sep 23, 2015

These are now unused.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

958d5534

vfio/pci: Config mirror quirk · 0d38fb1c

Authored by Alex Williamson on Sep 23, 2015

Re-implement our mirror quirk using the new infrastructure.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

0d38fb1c

vfio/pci: Config window quirks · 0e54f24a

Authored by Alex Williamson on Sep 23, 2015

Config windows make use of an address register and a data register.
In VGA cards, these are often used to provide real mode code in the
BIOS an easy way to access MMIO registers since the window often
resides in an I/O port register.  When the MMIO register has a mirror
of PCI config space, we need to trap those accesses and redirect them
to emulated config space.

The previous version of this functionality made use of a single
MemoryRegion and single match address.  This version uses separate
MemoryRegions for each of the address and data registers and allows
for multiple match addresses.  This is useful for Nvidia cards which
have two ranges which index into PCI config space.
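
A rough model of the address/data window with several match ranges might look like the following. This is only a sketch: `WindowMatch`, `ConfigWindow` and both handler names are hypothetical, and it does not use QEMU's MemoryRegion API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one address range that mirrors PCI config space. */
typedef struct {
    uint32_t base;
    uint32_t size;
} WindowMatch;

/* Hypothetical state shared by the address and data register handlers. */
typedef struct {
    const WindowMatch *matches;
    size_t nr_matches;
    const uint8_t *emulated_cfg;   /* emulated config space */
    size_t cfg_len;
    bool redirect;                 /* last address hit a match range */
    uint32_t cfg_offset;           /* offset into config space if so */
} ConfigWindow;

/* Address register write: remember whether the target is a config mirror. */
static void window_write_address(ConfigWindow *w, uint32_t addr)
{
    w->redirect = false;
    for (size_t i = 0; i < w->nr_matches; i++) {
        const WindowMatch *m = &w->matches[i];
        if (addr >= m->base && addr - m->base < m->size) {
            w->redirect = true;
            w->cfg_offset = addr - m->base;
            return;
        }
    }
}

/* Data register read: emulated config space on a match, else pass through. */
static uint8_t window_read_data(const ConfigWindow *w, uint8_t device_value)
{
    if (!w->redirect) {
        return device_value;
    }
    assert(w->cfg_offset < w->cfg_len);
    return w->emulated_cfg[w->cfg_offset];
}
```

Keeping the match ranges in an array is what lets one window serve a device with two mirrors; the address handler decides, and the data handler only acts on that decision.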

The previous implementation is left for the follow-on patch for a more
reviewable diff.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

0e54f24a

vfio/pci: Rework RTL8168 quirk · 954258a5

Authored by Alex Williamson on Sep 23, 2015

Another rework of this quirk, this time to update to the new quirk
structure.  We can handle the address and data registers with
separate MemoryRegions and a quirk specific data structure, making the
code much more understandable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

954258a5

vfio/pci: Cleanup Nvidia 0x3d0 quirk · 6029a424

Authored by Alex Williamson on Sep 23, 2015

The Nvidia 0x3d0 quirk makes use of two separate registers and gives
us our first chance to make use of separate memory regions for each to
simplify the code a bit.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

6029a424

vfio/pci: Cleanup ATI 0x3c3 quirk · b946d286

Authored by Alex Williamson on Sep 23, 2015

This is an easy quirk that really doesn't need a data structure of
its own.  We can pass vdev as the opaque data and access to the
MemoryRegion isn't required.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

b946d286

vfio/pci: Foundation for new quirk structure · 8c4f2348

Authored by Alex Williamson on Sep 23, 2015

VFIOQuirk hosts a single memory region and a fixed set of data fields
that try to handle all the quirk cases, but end up making those that
don't exactly match really confusing. This patch introduces a struct
intended to provide more flexibility and simpler code. VFIOQuirk is
stripped to its basics, an opaque data pointer for quirk specific
data and a pointer to an array of MemoryRegions with a counter. This
still allows us to have common teardown routines, but adds much
greater flexibility to support multiple memory regions and quirk
specific data structures that are easier to maintain. The existing
VFIOQuirk is transformed into VFIOLegacyQuirk, which further patches
will eliminate entirely.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

8c4f2348

vfio/pci: Cleanup ROM blacklist quirk · 056dfcb6

Authored by Alex Williamson on Sep 23, 2015

Create a vendor:device ID helper that we'll also use as we rework the
rest of the quirks. Re-reading the config entries, even if we get
more blacklist entries, is trivial overhead and only incurred during
device setup. There's no need to typedef the blacklist structure,
it's a static private data type used once. The elements get bumped
up to uint32_t to avoid future maintenance issues if PCI_ANY_ID gets
used for a blacklist entry (avoiding an actual hardware match). Our
test loop is also crying out to be simplified as a for loop.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

056dfcb6

vfio/pci: Split quirks to a separate file · c00d61d8

Authored by Alex Williamson on Sep 23, 2015

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

c00d61d8

vfio/pci: Extract PCI structures to a separate header · 78f33d2b

Authored by Alex Williamson on Sep 23, 2015

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

78f33d2b

vfio: Change polarity of our no-mmap option · 5e15d79b

Authored by Alex Williamson on Sep 23, 2015

The default should be to allow mmap and new drivers shouldn't need to
expose an option or set it to other than the allocation default in
their initfn.  Take advantage of the experimental flag to change this
option to the correct polarity.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

5e15d79b

vfio/pci: Make interrupt bypass runtime configurable · 46746dba

Authored by Alex Williamson on Sep 23, 2015

Tracing is more effective when we can completely disable all KVM
bypass paths. Make these runtime rather than build-time configurable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

46746dba

vfio/pci: Rename MSI/X functions for easier tracing · 0de70dc7

Authored by Alex Williamson on Sep 23, 2015

This allows vfio_msi* tracing.  The MSI/X interrupt tracing is also
pulled out of #ifdef DEBUG_VFIO to avoid a recompile for tracing this
path.  A few cycles to read the message is hardly anything if we're
already in QEMU.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

0de70dc7

vfio/pci: Rename INTx functions for easier tracing · 870cb6f1

Authored by Alex Williamson on Sep 23, 2015

Rename functions and tracing callbacks so that we can trace vfio_intx*
to see all the INTx related activities.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

870cb6f1

vfio/pci: Cleanup vfio_early_setup_msix() error path · b5bd049f

Authored by Alex Williamson on Sep 23, 2015

With the addition of the Chelsio quirk we have an error path out of
vfio_early_setup_msix() that doesn't free the allocated VFIOMSIXInfo
struct.  This doesn't introduce a leak as it still gets freed in the
vfio_put_device() path, but it's complicated and sloppy to rely on
that.  Restructure to free the allocated data on error and only link
it into the vdev on success.
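
The "free on error, link only on success" pattern described above can be sketched as follows. The `MsixInfo` and `Device` types and the failure condition are hypothetical stand-ins, not the actual VFIO structures.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int table_bar; } MsixInfo;   /* hypothetical */
typedef struct { MsixInfo *msix; } Device;    /* hypothetical */

/* Returns 0 on success, -1 on failure; dev->msix changes only on success. */
static int early_setup_msix(Device *dev, int table_bar)
{
    assert(dev);

    MsixInfo *msix = calloc(1, sizeof(*msix));
    if (!msix) {
        return -1;
    }
    msix->table_bar = table_bar;

    /* Stand-in for a quirk check that rejects the device. */
    if (table_bar < 0 || table_bar > 5) {
        free(msix);             /* error path owns the cleanup */
        return -1;
    }

    dev->msix = msix;           /* publish only once setup has succeeded */
    return 0;
}
```

Because the device never sees a half-initialized pointer, a later teardown path does not have to guess whether setup completed.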

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>

b5bd049f

vfio/pci: Cleanup RTL8168 quirk and tracing · d451008e

Authored by Alex Williamson on Sep 23, 2015

There's quite a bit of cleanup that can be done to the RTL8168 quirk,
as well as the tracing to prevent a spew of uninteresting accesses
for anything else the driver might choose to use the window registers
for besides the MSI-X table.  There should be no functional change,
but it's now possible to get compact and useful traces by enabling
vfio_rtl8168_quirk*, ex:

vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f000
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f000
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0xfee0100c
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f004
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f004
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f008
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f008
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x49b1
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

d451008e

23 Sep, 2015 18 commits

sPAPR: Enable EEH on VFIO PCI device only · d76548a9

Authored by Gavin Shan on Sep 18, 2015

This checks if the PCI device retrieved from the PCI device address
is a VFIO PCI device when enabling EEH functionality. If it's not a
VFIO PCI device, the EEH functionality isn't enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

d76548a9

sPAPR: Revert don't enable EEH on emulated PCI devices · 47445c80

Authored by Gavin Shan on Sep 18, 2015

This reverts commit 7cb18007 ("sPAPR: Don't enable EEH on emulated
PCI devices") as rtas_ibm_set_eeh_option() isn't the right place
to check whether there is a corresponding PCI device for the input
address, which can be a PE address rather than a PCI device address.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

47445c80

ppc/spapr: Implement H_RANDOM hypercall in QEMU · 4d9392be

Authored by Thomas Huth on Sep 17, 2015

The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
                   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other examples of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

4d9392be

ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory() · ef001f06

Authored by Thomas Huth on Sep 15, 2015

The buffer that is allocated in spapr_populate_drconf_memory()
is used for setting both the "ibm,dynamic-memory" and the
"ibm,associativity-lookup-arrays" property. However, only the
size of the first one is taken into account when allocating the
memory. So if the length of the second property is larger than
the length of the first one, we run into a buffer overflow here!
Fix it by taking the length of the second property into account,
too.
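
The idea of the fix can be shown with a small sketch: size the shared buffer for the larger of its two uses. The helper names are hypothetical, and the real function computes the two lengths from the device tree layout rather than taking them as parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A buffer reused for two properties must fit the larger of the two. */
static size_t shared_buf_len(size_t first_len, size_t second_len)
{
    return first_len > second_len ? first_len : second_len;
}

/* Hypothetical allocator for the shared buffer; caller frees the result. */
static void *alloc_shared_buf(size_t first_len, size_t second_len)
{
    size_t len = shared_buf_len(first_len, second_len);

    assert(len > 0);
    return calloc(1, len);
}
```

Sizing only for the first property is safe just as long as the second one happens to be smaller, which is the hidden assumption the fix removes.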

Fixes: "spapr: Support ibm,dynamic-reconfiguration-memory" patch
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ef001f06

spapr: Fix default NUMA node allocation for threads · 20bb648d

Authored by David Gibson on Sep 08, 2015

At present, if guest numa nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour of assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes. As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.
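
The difference between the two assignment policies can be sketched like this. It assumes vcpu indices are laid out thread-first, then core, then socket; the function names are hypothetical and the actual hook may differ in detail.

```c
#include <assert.h>

/* Round-robin by vcpu index: adjacent threads land on different nodes. */
static unsigned node_by_cpu(unsigned cpu_index, unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return cpu_index % nb_nodes;
}

/* Assumption: vcpus are numbered thread-first, then core, then socket. */
static unsigned socket_of_cpu(unsigned cpu_index, unsigned cores,
                              unsigned threads)
{
    assert(cores > 0 && threads > 0);
    return cpu_index / (cores * threads);
}

/* Round-robin by socket: every thread of a core shares one node. */
static unsigned node_by_socket(unsigned cpu_index, unsigned cores,
                               unsigned threads, unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return socket_of_cpu(cpu_index, cores, threads) % nb_nodes;
}
```

For example, with 2 cores of 4 threads per socket and 2 nodes, vcpus 0 to 7 all map to node 0 under the socket policy, whereas the per-vcpu policy already splits vcpus 0 and 1.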

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

20bb648d

spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type · 0a417869

Authored by Bharata B Rao on Aug 03, 2015

Till now memory hotplug used RTAS_LOG_V6_HP_ID_DRC_INDEX hotplug type
which meant that we generated one hotplug type of EPOW event for every
256MB (SPAPR_MEMORY_BLOCK_SIZE). This quickly overruns the kernel
rtas log buffer thus resulting in loss of memory hotplug events. Switch
to RTAS_LOG_V6_HP_ID_DRC_COUNT hotplug type for memory so that we
generate only one event per hotplug request.
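
To make the difference in event volume concrete, here is a small arithmetic sketch. The 256MB block size comes from the message above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_BLOCK_SIZE (256ULL << 20)   /* 256MB, as stated above */

/* One event per 256MB block when identifying each DRC by index. */
static uint64_t events_by_drc_index(uint64_t hotplug_bytes)
{
    assert(hotplug_bytes % MEMORY_BLOCK_SIZE == 0);
    return hotplug_bytes / MEMORY_BLOCK_SIZE;
}

/* A single event per request when only the DRC count is reported. */
static uint64_t events_by_drc_count(uint64_t hotplug_bytes)
{
    return hotplug_bytes ? 1 : 0;
}
```

Hot-adding 8GB, for instance, comes to 32 events under the per-index scheme and a single event under the count-based one.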

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

0a417869

spapr: Support hotplug by specifying DRC count · 7a36ae7a

Authored by Bharata B Rao on Aug 03, 2015

Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows
hotplugging of DRCs by specifying the DRC count.

While we are here, rename

spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index()
spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index()

so that they match with spapr_hotplug_req_add_by_count().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

7a36ae7a

spapr: Revert to memory@XXXX representation for non-hotplugged memory · e8f986fc

Authored by Bharata B Rao on Aug 03, 2015

Don't represent non-hotpluggable memory under drconf node. With this
we don't have to create DRC objects for them.

The effect of this patch is that we revert back to memory@XXXX representation
for all the memory specified with -m option and represent the cold
plugged memory and hot-pluggable memory under
ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

e8f986fc

spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA · 6663864e

Authored by Bharata B Rao on Aug 03, 2015

When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

6663864e

spapr: Provide better error message when slots exceed max allowed · 19a35c9e

Authored by Bharata B Rao on Aug 03, 2015

Currently when user specifies more slots than allowed max of
SPAPR_MAX_RAM_SLOTS (32), we error out like this:

qemu-system-ppc64: unsupported amount of memory slots: 64

Let the user know about the max allowed slots like this:

qemu-system-ppc64: Specified number of memory slots 64 exceeds max supported 32

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

19a35c9e

spapr: Don't allow memory hotplug to memory less nodes · b556854b

Authored by Bharata B Rao on Jun 29, 2015

Currently PowerPC kernel doesn't allow hot-adding memory to memory-less
node, but instead will silently add the memory to the first node that has
some memory. This causes two unexpected behaviours for the user.

- Memory gets hotplugged to a different node than what the user specified.
- Since pc-dimm subsystem in QEMU still thinks that memory belongs to
  memory-less node, a reboot will set things accordingly and the previously
  hotplugged memory now ends in the right node. This appears as if some
  memory moved from one node to another.

So until kernel starts supporting memory hotplug to memory-less
nodes, just prevent such attempts upfront in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

b556854b

spapr: Memory hotplug support · c20d332a

Authored by Bharata B Rao on Sep 01, 2015

Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

c20d332a

spapr: Make hash table size a factor of maxram_size · ce881f77

Authored by Bharata B Rao on Jun 29, 2015

The hash table size currently depends on ram_size, but with hotplug
the memory can grow up to maxram_size. Hence make the hash table size
dependent on maxram_size.

This allows hotplugging huge amounts of memory to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ce881f77

spapr: Support ibm,dynamic-reconfiguration-memory · 03d196b7

Authored by Bharata B Rao on Jul 13, 2015

Parse ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if guest
supports it. This is in preparation to support memory hotplug for
sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. But after this patch, only memory@0 node
for RMA is built upfront. Guest kernel boots with just that and rest of
the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
are built when guest does ibm,client-architecture-support call.

Note: This patch needs a SLOF enhancement which is already part of
SLOF binary in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

03d196b7

spapr: Add LMB DR connectors · 224245bf

Authored by David Gibson on Aug 12, 2015

Enable memory hotplug for pseries 2.4 and add LMB DR connectors.
With memory hotplug, enforce RAM size, NUMA node memory size and maxmem
to be a multiple of SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the
granularity in which LMBs are represented and hot-added.

LMB DR connectors will be used by the memory hotplug code.
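
The alignment requirement amounts to a simple check, sketched below with hypothetical helper names. The 256M block size is taken from the message; how the real code reports a misaligned size is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEMORY_BLOCK_SIZE (256ULL << 20)   /* 256M LMB granularity */

/* True when a size can be expressed as a whole number of LMBs. */
static bool lmb_aligned(uint64_t bytes)
{
    return bytes % MEMORY_BLOCK_SIZE == 0;
}

/* Number of LMBs covering an aligned size. */
static uint64_t lmb_count(uint64_t bytes)
{
    assert(lmb_aligned(bytes));
    return bytes / MEMORY_BLOCK_SIZE;
}
```

A 1G size passes the check, 1G plus 128M does not, and an aligned 2T range corresponds to 8192 LMBs.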

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [spapr_drc_reset implementation]
[since this missed the 2.4 cutoff, changing to only enable for 2.5]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

224245bf

spapr: Use QEMU limit for maximum CPUs number · 38b02bd8

Authored by Alexey Kardashevskiy on Aug 06, 2015

sPAPR uses hard coded limit of maximum 255 supported CPUs which is
exactly the same as QEMU-wide limit which is MAX_CPUMASK_BITS and also
defined as 255.

This makes use of a global CPU number limit for the "pseries" machine.

In order to anticipate future increase of the MAX_CPUMASK_BITS
(or to help debugging large systems), this also bumps the FDT_MAX_SIZE
limit from 256K to 1M assuming that 1 CPU core needs roughly 512 bytes
in the device tree so the new limit can cover up to 2048 CPU cores.
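
The sizing estimate is plain division, as this sketch shows. The helper name is hypothetical, and the calculation ignores everything in the device tree other than CPU cores, as the estimate above does.

```c
#include <assert.h>
#include <stddef.h>

/* How many cores fit if each needs roughly bytes_per_core of device tree. */
static size_t cores_covered(size_t fdt_max_size, size_t bytes_per_core)
{
    assert(bytes_per_core > 0);
    return fdt_max_size / bytes_per_core;
}
```

With 512 bytes per core, a 1M limit covers 2048 cores, while the old 256K limit would cover 512 by the same estimate.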

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

38b02bd8

spapr: Don't use QOM [*] syntax for DR connectors. · 94649d42

Authored by David Gibson on Sep 16, 2015

The dynamic reconfiguration (hotplug) code for the pseries machine type
uses a "DR connector" QOM object for each resource it will be possible
to hotplug. Each of these is added to its owner using
object_property_add_child(owner, "dr-connector[*]", ...);

That works ok, mostly, but it means that the property indices are
arbitrary, depending on the order in which the connectors are constructed.
That might line up to something useful, but it doesn't have to.

It will get worse once we add hotplug RAM support. That will add a DR
connector object for every 256MB of potential memory. So if maxmem=2T,
for example, there are 8192 objects under the same parent.

The QOM interfaces aren't really designed for this. In particular
object_property_add() with [*] has O(n^2) time complexity (in the number of
existing children): first it has a linear search through array indices to
find a free slot, each of which is attempted with a recursive call to
object_property_add() with a specific [N]. Those calls are O(n) because
there's a linear search through all properties to check for duplicates.

By using a meaningful index value, which we already know is unique we can
avoid the [*] special behaviour. That lets us reduce the total time for
creating the DR objects from O(n^3) to O(n^2).

O(n^2) is still kind of crappy, but it's enough to reduce the startup time
of qemu (with in-progress memory hotplug support) with maxmem=2T from ~20
minutes to ~4 seconds.
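
The complexity argument can be reproduced with a toy model that counts name comparisons. This is not QOM code: `PropList` and the functions below are hypothetical, and the model only mirrors the two behaviours described above (a linear duplicate check per add, and a linear probe for a free index with `[*]`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROPS 1024

/* Toy property list: stores only the numeric index of each child. */
typedef struct {
    unsigned idx[MAX_PROPS];
    size_t n;
    unsigned long compares;     /* duplicate-check comparisons so far */
} PropList;

/* Add with a specific [N]: linear scan for duplicates, then append. */
static bool add_exact(PropList *p, unsigned index)
{
    for (size_t i = 0; i < p->n; i++) {
        p->compares++;
        if (p->idx[i] == index) {
            return false;       /* name already taken */
        }
    }
    assert(p->n < MAX_PROPS);
    p->idx[p->n++] = index;
    return true;
}

/* Add with [*]: try 0, 1, 2, ... until one exact add succeeds. */
static void add_wildcard(PropList *p)
{
    for (unsigned index = 0; !add_exact(p, index); index++) {
        /* keep probing */
    }
}

static unsigned long cost_wildcard(size_t n)
{
    PropList p = { .n = 0 };
    for (size_t i = 0; i < n; i++) {
        add_wildcard(&p);
    }
    return p.compares;
}

static unsigned long cost_explicit(size_t n)
{
    PropList p = { .n = 0 };
    for (size_t i = 0; i < n; i++) {
        add_exact(&p, (unsigned)i);
    }
    return p.compares;
}
```

In this model the explicit-index total grows quadratically with the number of children and the wildcard total grows cubically, matching the O(n^2) versus O(n^3) figures in the message.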

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

94649d42

spapr_drc: use RTAS return codes for methods called by RTAS · 0cb688d2

Authored by Michael Roth on Sep 10, 2015

Certain methods in sPAPRDRConnector objects are only ever called by
RTAS and in many cases are responsible for the logic that determines
the RTAS return codes.

Rather than having a level of indirection requiring RTAS code to
re-interpret return values from such methods to determine the
appropriate return code, just pass them through directly.

This requires changing method return types to uint32_t to match the
type of values currently passed to RTAS helpers.

In the case of read accesses like drc->entity_sense() where we weren't
previously reporting any errors, just the read value, we modify the
function to return RTAS return code, and pass the read value back via
reference.
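
The calling convention described above might look roughly like this. The `Connector` type, `rtas_reply()` and the numeric values of the return codes are assumptions made for illustration; the real sPAPRDRConnector methods and RTAS constants should be checked in the QEMU source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; check the real RTAS definitions. */
#define RTAS_OUT_SUCCESS      0
#define RTAS_OUT_PARAM_ERROR  (-3)

typedef struct { bool present; } Connector;   /* hypothetical stand-in */

/* Method returns an RTAS code; the value read comes back via *state. */
static uint32_t entity_sense(const Connector *drc, uint32_t *state)
{
    if (!drc) {
        return (uint32_t)RTAS_OUT_PARAM_ERROR;
    }
    *state = drc->present ? 1 : 0;
    return RTAS_OUT_SUCCESS;
}

/* RTAS-side caller: forwards the method's code without re-interpreting it. */
static void rtas_reply(const Connector *drc, uint32_t *rets)
{
    uint32_t state = 0;

    assert(rets);
    rets[0] = entity_sense(drc, &state);
    rets[1] = state;
}
```

The caller no longer needs a translation table from method results to RTAS codes, at the cost of the method knowing about RTAS.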

Suggested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

0cb688d2