提交 · 5ddda0b8f0c07c8082c87d248c8eb23f43fd44a1 · openeuler / qemu

16 6月, 2016 6 次提交

block: Make .bdrv_load_vmstate() vectored · 5ddda0b8

由 Kevin Wolf 提交于 6月 09, 2016

This brings it in line with .bdrv_save_vmstate().
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NFam Zheng <famz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

5ddda0b8

block: Introduce bdrv_preadv() · f1e84741

由 Kevin Wolf 提交于 6月 09, 2016

We already have a byte-based bdrv_pwritev(), but the read counterpart
was still missing. This commit adds it.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NFam Zheng <famz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

f1e84741

block: Byte-based bdrv_co_do_copy_on_readv() · 244483e6

由 Kevin Wolf 提交于 6月 02, 2016

In a first step to convert the common I/O path to work on bytes rather
than sectors, this converts the copy-on-read logic that is used by
bdrv_aligned_preadv().
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

244483e6

block: Assert that flags are in range · fa166538

由 Eric Blake 提交于 6月 13, 2016

Add a new BDRV_REQ_MASK constant, and use it to make sure that
caller flags are always valid.

Tested with 'make check' and with qemu-iotests on both '-raw'
and '-qcow2'; the only failure turned up was fixed in the
previous commit.
Signed-off-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

fa166538

migration: rename functions to starting migrations · 22724f49

由 Daniel P. Berrange 提交于 6月 01, 2016

Apply the following renames for starting incoming migration:

 process_incoming_migration -> migration_fd_process_incoming
 migration_set_incoming_channel -> migration_channel_process_incoming
 migration_tls_set_incoming_channel -> migration_tls_channel_process_incoming

and for starting outgoing migration:

 migration_set_outgoing_channel -> migration_channel_connect
 migration_tls_set_outgoing_channel -> migration_tls_channel_connect
Signed-off-by: NDaniel P. Berrange <berrange@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Message-id: 1464776234-9910-3-git-send-email-berrange@redhat.com
Message-Id: <1464776234-9910-3-git-send-email-berrange@redhat.com>
Signed-off-by: NAmit Shah <amit.shah@redhat.com>

22724f49

Postcopy: Add stats on page requests · d3bf5418

由 Dr. David Alan Gilbert 提交于 6月 13, 2016

On the source, add a count of page requests received from the
destination.
Signed-off-by: NDr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NDenis V. Lunev <den@openvz.org>
Message-id: 1465816605-29488-4-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-4-git-send-email-dgilbert@redhat.com>
Signed-off-by: NAmit Shah <amit.shah@redhat.com>

d3bf5418

15 6月, 2016 2 次提交

target-i386: Implement CPUID[0xB] (Extended Topology Enumeration) · 5232d00a

由 Radim Krčmář 提交于 5月 12, 2016

I looked at a dozen Intel CPU that have this CPUID and all of them
always had Core offset as 1 (a wasted bit when hyperthreading is
disabled) and Package offset at least 4 (wasted bits at <= 4 cores).

QEMU uses more compact IDs and it doesn't make much sense to change it
now. I keep the SMT and Core sub-leaves even if there is just one
thread/core; it makes the code simpler and there should be no harm.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

5232d00a

pc: Add 2.7 machine · d86c1451

由 Igor Mammedov 提交于 5月 17, 2016

Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

d86c1451

14 6月, 2016 15 次提交

arm: xlnx-zynqmp: Add xlnx-dp and xlnx-dpdma · b93dbcdd

由 KONRAD Frederic 提交于 6月 14, 2016

This adds the DP and the DPDMA to the Zynq MP platform.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: NPeter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-10-git-send-email-fred.konrad@greensocs.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

b93dbcdd

introduce xlnx-dp · 58ac482a

由 KONRAD Frederic 提交于 6月 14, 2016

This is the implementation of the DisplayPort.
It has an aux-bus to access dpcd and edid.

Graphic plane is connected to the channel 3.
Video plane is connected to the channel 0.
Audio stream are connected to the channels 4 and 5.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-9-git-send-email-fred.konrad@greensocs.com
[PMM: fixed format strings]
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

58ac482a

introduce xlnx-dpdma · d3c6369a

由 KONRAD Frederic 提交于 6月 14, 2016

This is the implementation of the DPDMA.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-8-git-send-email-fred.konrad@greensocs.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

d3c6369a

hw/i2c-ddc.c: Implement DDC I2C slave · 78c71af8

由 Peter Maydell 提交于 6月 14, 2016

Implement an I2C slave which implements DDC and returns the
EDID data for an attached monitor.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Tested-by: NHyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-7-git-send-email-fred.konrad@greensocs.com

  - Rebased on the current master.
  - Modified for QOM.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
[PMM: actually wire up the vmstate to dc->vmsd]
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

78c71af8

introduce dpcd module · e27ed1bd

由 KONRAD Frederic 提交于 6月 14, 2016

This introduces dpcd module.
It wires on a aux-bus and can be accessed by the driver to get lane-speed, etc.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Reviewed-by: NPeter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

e27ed1bd

introduce aux-bus · 6fc7f77f

由 KONRAD Frederic 提交于 6月 14, 2016

This introduces a new bus: aux-bus.

It contains an address space for aux slaves devices and a bridge to an I2C bus
for I2C through AUX transactions.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: NHyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

6fc7f77f

i2c: Factor our send() and recv() common logic · 056fca7b

由 Peter Crosthwaite 提交于 6月 14, 2016

Most of the control flow logic between send and recv (error checking
etc) is the same. Factor this out into a common send_recv() API.
This is then usable by clients, where the control logic for send
and receive differs only by a boolean. E.g.

if (send)
   i2c_send(...):
else
   i2c_recv(...);

becomes:

i2c_send_recv(... , send);
Signed-off-by: NPeter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-4-git-send-email-fred.konrad@greensocs.com
Changes from FK:
  * Rebased on master.
  * Rebased on my i2c broadcast patch.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: NAlistair Francis <alistair.francis@xilinx.com>
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

056fca7b

hw/arm/virt: Add PMU node for virt machine · 01fe6b60

由 Shannon Zhao 提交于 6月 14, 2016

Add a virtual PMU device for virt machine while use PPI 7 for PMU
overflow interrupt number.
Signed-off-by: NShannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Message-id: 1465267577-1808-3-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

01fe6b60

xen: Clean up includes · b1b23e5b

由 Peter Maydell 提交于 5月 24, 2016

Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Signed-off-by: NStefano Stabellini <sstabellini@kernel.org>

b1b23e5b

s390x/css: introduce property type for device ids · 06e686ea

由 Cornelia Huck 提交于 4月 01, 2016

Let's introduce a CssDevId to handle device ids of the xx.x.xxxx
type used for channel devices. This has some benefits:

- We can use them in virtio-ccw and split the validity checks for
  a channel device id in general from the constraint checking
  within the virtio-ccw scope.
- We can reuse the device id type for future non-virtio channel
  devices.

While we're at it, improve the validity checks and disallow e.g.
trailing characters.
Suggested-by: NDong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: NDong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com>

06e686ea

s390x/kvm: add interface for clearing IO irqs · 9eccb862

由 Halil Pasic 提交于 1月 27, 2016

According to the platform specification, under certain conditions,
pending IO interruptions have to be cleared. Let's add an interface
for that.
Signed-off-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com>

9eccb862

C
linux-headers: update · ff804f15
由 Cornelia Huck 提交于 6月 07, 2016
```
Update to 4.7-rc2.
Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com>
```
ff804f15

spapr: Ensure all LMBs are represented in ibm,dynamic-memory · d0e5a8f2

由 Bharata B Rao 提交于 6月 10, 2016

Memory hotplug can fail for some combinations of RAM and maxmem when
DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
on maximum addressable memory returned by guest and this value is currently
being calculated wrongly by the guest kernel routine memory_hotplug_max().
While there is an attempt to fix the guest kernel, this patch works
around the problem within QEMU itself.

memory_hotplug_max() routine in the guest kernel arrives at max
addressable memory by multiplying lmb-size with the lmb-count obtained
from ibm,dynamic-memory property. There are two assumptions here:

- All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
  where only hot-pluggable LMBs are present in this property.
- The memory area comprising of RAM and hotplug region is contiguous: This
  needn't be true always for PowerKVM as there can be gap between
  boot time RAM and hotplug region.

To work around this guest kernel bug, ensure that ibm,dynamic-memory
has information about all the LMBs (RMA, boot-time LMBs, future
hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
hotpluggable region).

RMA is represented separately by memory@0 node. Hence mark RMA LMBs
and also the LMBs for the gap b/n RAM and hotpluggable region as
reserved and as having no valid DRC so that these LMBs are not considered
by the guest.
Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: NNathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

d0e5a8f2

macio: call dma_memory_unmap() at the end of each DMA transfer · bc9ca595

由 Mark Cave-Ayland 提交于 6月 10, 2016

This ensures that the underlying memory is marked dirty once the transfer
is complete and resolves cache coherency problems under MacOS 9.
Signed-off-by: NMark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

bc9ca595

Add PowerPC AT_HWCAP2 definitions · 42bff477

由 Anton Blanchard 提交于 6月 07, 2016

We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
add the PowerPC AT_HWCAP2 definitions.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>

42bff477

13 6月, 2016 2 次提交

crypto: aes: always rename internal symbols · c8d70e59

由 Mike Frysinger 提交于 6月 06, 2016

OpenSSL's libcrypto always defines AES symbols with the same names as
qemu's local aes code.  This is problematic when enabling at least curl
as that frequently also uses libcrypto.  It might not be noticed when
running, but if you try to statically link, everything falls down.

An example snippet:
  LINK  qemu-nbd
.../libcrypto.a(aes-x86_64.o): In function 'AES_encrypt':
(.text+0x460): multiple definition of 'AES_encrypt'
crypto/aes.o:aes.c:(.text+0x670): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_decrypt':
(.text+0x9f0): multiple definition of 'AES_decrypt'
crypto/aes.o:aes.c:(.text+0xb30): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_cbc_encrypt':
(.text+0xf90): multiple definition of 'AES_cbc_encrypt'
crypto/aes.o:aes.c:(.text+0xff0): first defined here
collect2: error: ld returned 1 exit status
.../qemu-2.6.0/rules.mak:105: recipe for target 'qemu-nbd' failed
make: *** [qemu-nbd] Error 1

The aes.h header has redefines already for FreeBSD, but go ahead and
enable that for everyone since there's no real good reason to not use
a namespace all the time.
Signed-off-by: NMike Frysinger <vapier@chromium.org>
Signed-off-by: NDaniel P. Berrange <berrange@redhat.com>

c8d70e59

vl: Eliminate usb_enabled() · 4bcbe0b6

由 Eduardo Habkost 提交于 6月 08, 2016

This wrapper for machine_usb(current_machine) is not necessary,
replace all usages of usb_enabled() with machine_usb().

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NMarcel Apfelbaum <marcel@redhat.com>
Reviewed-by: NThomas Huth <thuth@redhat.com>
Message-id: 1465419025-21519-3-git-send-email-ehabkost@redhat.com
Signed-off-by: NGerd Hoffmann <kraxel@redhat.com>

4bcbe0b6

12 6月, 2016 10 次提交

tb hash: track translated blocks with qht · 909eaac9

由 Emilio G. Cota 提交于 6月 08, 2016

Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.

Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.

The appended fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:

- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
              MRU is implemented here by adding an intermediate struct
              that contains the u32 hash and a pointer to the TB; this
              allows us, on an MRU promotion, to copy said struct (that is not
              at the head), and put this new copy at the head. After a grace
              period, the original non-head struct can be eliminated, and
              after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
                   no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
                 no MRU for lookups; MRU for inserts.

The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
  http://imgur.com/a/Awgnq

Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.

                            Host: Intel Xeon E5-2690

  28 ++------------+-------------+-------------+-------------+------------++
     A*****        +             +             +             master **A*** +
  27 ++    *                                                 xxhash ##B###++
     |      A******A******                               xxhash-rcu $$C$$$ |
  26 C$$                  A******A******            qht-fixed-nomru*%%D%%%++
     D%%$$                              A******A******A*qht-dyn-mru A*E****A
  25 ++ %%$$                                          qht-dyn-nomru &&F&&&++
     B#####%                                                               |
  24 ++    #C$$$$$                                                        ++
     |      B###  $                                                        |
     |          ## C$$$$$$                                                 |
  23 ++           #       C$$$$$$                                         ++
     |             B######       C$$$$$$                                %%%D
  22 ++                  %B######       C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C
     |                    D%%%%%%B######      @E@@@@@@    %%%D%%%@@@E@@@@@@E
  21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B
     +             E@@@   F&&&   +      E@     +      F&&&   +             +
  20 ++------------+-------------+-------------+-------------+------------++
     14            16            18            20            22            24
                             log2 number of buckets

                                 Host: Intel i7-4790K

  14.5 ++------------+------------+-------------+------------+------------++
       A**           +            +             +            master **A*** +
    14 ++ **                                                 xxhash ##B###++
  13.5 ++   **                                           xxhash-rcu $$C$$$++
       |                                            qht-fixed-nomru %%D%%% |
    13 ++     A******                                   qht-dyn-mru @@E@@@++
       |             A*****A******A******             qht-dyn-nomru &&F&&& |
  12.5 C$$                               A******A******A*****A******    ***A
    12 ++ $$                                                        A***  ++
       D%%% $$                                                             |
  11.5 ++  %%                                                             ++
       B###  %C$$$$$$                                                      |
    11 ++  ## D%%%%% C$$$$$                                               ++
       |     #      %      C$$$$$$                                         |
  10.5 F&&&&&&B######D%%%%%       C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$    $$$C
    10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B
       +             F&&          D%%%%%%B######B######B#####B###@@@D%%%   +
   9.5 ++------------+------------+-------------+------------+------------++
       14            16           18            20           22            24
                              log2 number of buckets

Note that the original point before this patch series is X=15 for "master";
the little sensitivity to the increased number of buckets is due to the
poor hashing function in master.

xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be do consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.

qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.

However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.

Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.

The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:

- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time

We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.

A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: NSergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

909eaac9

qht: QEMU's fast, resizable and scalable Hash Table · 2e11264a

由 Emilio G. Cota 提交于 6月 08, 2016

This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.

A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

2e11264a

qdist: add module to represent frequency distributions of data · bf3afd5f

由 Emilio G. Cota 提交于 6月 08, 2016

Sometimes it is useful to have a quick histogram to represent a certain
distribution -- for example, when investigating a performance regression
in a hash table due to inadequate hashing.

The appended allows us to easily represent a distribution using Unicode
characters. Further, the data structure keeping track of the distribution
is so simple that obtaining its values for off-line processing is trivial.

Example, taking the last 10 commits to QEMU:

 Characters in commit title  Count
-----------------------------------
                         39      1
                         48      1
                         53      1
                         54      2
                         57      1
                         61      1
                         67      1
                         78      1
                         80      1
qdist_init(&dist);
qdist_inc(&dist, 39);
[...]
qdist_inc(&dist, 80);

char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);

char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-9-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

bf3afd5f

tb hash: hash phys_pc, pc, and flags with xxhash · 42bd3228

由 Emilio G. Cota 提交于 6月 08, 2016

For some workloads such as arm bootup, tb_phys_hash is performance-critical.
The is due to the high frequency of accesses to the hash table, originated
by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's.
More info:
  https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html

To dig further into this I modified an arm image booting debian jessie to
immediately shut down after boot. Analysis revealed that quite a bit of time
is unnecessarily spent in tb_phys_hash: the cause is poor hashing that
results in very uneven loading of chains in the hash table's buckets;
the longest observed chain had ~550 elements.

The appended addresses this with two changes:

1) Use xxhash as the hash table's hash function. xxhash is a fast,
   high-quality hashing function.

2) Feed the hashing function with not just tb_phys, but also pc and flags.

This improves performance over using just tb_phys for hashing, since that
resulted in some hash buckets having many TB's, while others getting very few;
with these changes, the longest observed chain on a single hash bucket is
brought down from ~550 to ~40.

Tests show that the other element checked for in tb_find_physical,
cs_base, is always a match when tb_phys+pc+flags are a match,
so hashing cs_base is wasteful. It could be that this is an ARM-only
thing, though. UPDATE:
On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.

BTW, after this change the hash table should not be called "tb_hash_phys"
anymore; this is addressed later in this series.

This change gives consistent bootup time improvements. I tested two
host machines:
- Intel Xeon E5-2690: 11.6% less time
- Intel i7-4790K: 19.2% less time

Increasing the number of hash buckets yields further improvements. However,
using a larger, fixed number of buckets can degrade performance for other
workloads that do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

42bd3228

exec: add tb_hash_func5, derived from xxhash · dc8b295d

由 Emilio G. Cota 提交于 6月 08, 2016

This will be used by upcoming changes for hashing the tb hash.

Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-7-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

dc8b295d

qemu-thread: add simple test-and-set spinlock · ac9a9eba

由 Guillaume Delbergue 提交于 6月 08, 2016

Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: NGuillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: use TAS instead of atomic_xchg; emit acquire/release
 barriers; return bool from trylock; call cpu_relax() while spinning;
 optimize for uncontended locks by acquiring the lock with TAS instead
 of TATAS; add qemu_spin_locked().]
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-6-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

ac9a9eba

include/processor.h: define cpu_relax() · 462cda50

由 Emilio G. Cota 提交于 6月 08, 2016

Taken from the linux kernel.
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-5-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

462cda50

seqlock: rename write_lock/unlock to write_begin/end · 03719e44

由 Emilio G. Cota 提交于 6月 08, 2016

It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

03719e44

seqlock: remove optional mutex · ccdb3c1f

由 Emilio G. Cota 提交于 6月 08, 2016

This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

ccdb3c1f

compiler.h: add QEMU_ALIGNED() to enforce struct alignment · 911a4d22

由 Emilio G. Cota 提交于 6月 08, 2016

Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-2-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

911a4d22

09 6月, 2016 1 次提交

cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() · 6886b980

由 Peter Maydell 提交于 5月 17, 2016

The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: NEduardo Habkost <ehabkost@redhat.com>
Acked-by: NRiku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org

6886b980

08 6月, 2016 4 次提交

block: Kill bdrv_co_write_zeroes() · c1499a5e

由 Eric Blake 提交于 6月 01, 2016

Now that all drivers have been converted to a byte interface,
we no longer need a sector interface.
Signed-off-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

c1499a5e

block: Switch bdrv_write_zeroes() to byte interface · 74021bc4

由 Eric Blake 提交于 6月 01, 2016

Rename to bdrv_pwrite_zeroes() to let the compiler ensure we
cater to the updated semantics.  Do the same for bdrv_co_write_zeroes().
Signed-off-by: NEric Blake <eblake@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

74021bc4

block: Add .bdrv_co_pwrite_zeroes() · d05aa8bb

由 Eric Blake 提交于 6月 01, 2016

Update bdrv_co_do_write_zeroes() to be byte-based, and select
between the new byte-based bdrv_co_pwrite_zeroes() or the old
bdrv_co_write_zeroes().  The next patches will convert drivers,
then remove the old interface.
Signed-off-by: NEric Blake <eblake@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

d05aa8bb

block: Track write zero limits in bytes · cf081fca

由 Eric Blake 提交于 6月 01, 2016

Another step towards removing sector-based interfaces: convert
the maximum write and minimum alignment values from sectors to
bytes.  Rename the variables to let the compiler check that all
users are converted to the new semantics.

The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS
is constrained by INT_MAX (this means that we can't even
support a 2G write_zeroes, but just under it) - changing
operation lengths to unsigned or to 64-bits is a much bigger
audit, and debatable if we even want to do it (since at the
core, a 32-bit platform will still have ssize_t as its
underlying limit on write()).

Meanwhile, alignment is changed to 'uint32_t', since it makes no
sense to have an alignment larger than the maximum write, and
less painful to use an unsigned type with well-defined behavior
in bit operations than to have to worry about what happens if
a driver mistakenly supplies a negative alignment.

Add an assert that no one was trying to use sectors to get a
write zeroes larger than 2G, and therefore that a later conversion
to bytes won't be impacted by keeping the limit at 32 bits.
Signed-off-by: NEric Blake <eblake@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

cf081fca