1. 17 Dec, 2018 14 commits
    • hardfloat: implement float32/64 fused multiply-add · ccf770ba
      Emilio G. Cota committed
      Performance results for fp-bench:
      
      1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
      - before:
      fma-single: 74.73 MFlops
      fma-double: 74.54 MFlops
      - after:
      fma-single: 203.37 MFlops
      fma-double: 169.37 MFlops
      
      2. ARM Aarch64 A57 @ 2.4GHz
      - before:
      fma-single: 23.24 MFlops
      fma-double: 23.70 MFlops
      - after:
      fma-single: 66.14 MFlops
      fma-double: 63.10 MFlops
      
      3. IBM POWER8E @ 2.1 GHz
      - before:
      fma-single: 37.26 MFlops
      fma-double: 37.29 MFlops
      - after:
      fma-single: 48.90 MFlops
      fma-double: 59.51 MFlops
      
      Here having HARDFLOAT_3F64_USE_FP set to 1 pays off for x86_64:
      [1] 170.15 vs [0] 153.12 MFlops
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • hardfloat: implement float32/64 division · 4a629561
      Emilio G. Cota committed
      Performance results for fp-bench:
      
      1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
      - before:
      div-single: 34.84 MFlops
      div-double: 34.04 MFlops
      - after:
      div-single: 275.23 MFlops
      div-double: 216.38 MFlops
      
      2. ARM Aarch64 A57 @ 2.4GHz
      - before:
      div-single: 9.33 MFlops
      div-double: 9.30 MFlops
      - after:
      div-single: 51.55 MFlops
      div-double: 15.09 MFlops
      
      3. IBM POWER8E @ 2.1 GHz
      - before:
      div-single: 25.65 MFlops
      div-double: 24.91 MFlops
      - after:
      div-single: 96.83 MFlops
      div-double: 31.01 MFlops
      
      Here setting HARDFLOAT_2F64_USE_FP to 1 pays off for x86_64:
      [1] 215.97 vs [0] 62.15 MFlops
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • hardfloat: implement float32/64 multiplication · 2dfabc86
      Emilio G. Cota committed
      Performance results for fp-bench:
      
      1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
      - before:
      mul-single: 126.91 MFlops
      mul-double: 118.28 MFlops
      - after:
      mul-single: 258.02 MFlops
      mul-double: 197.96 MFlops
      
      2. ARM Aarch64 A57 @ 2.4GHz
      - before:
      mul-single: 37.42 MFlops
      mul-double: 38.77 MFlops
      - after:
      mul-single: 73.41 MFlops
      mul-double: 76.93 MFlops
      
      3. IBM POWER8E @ 2.1 GHz
      - before:
      mul-single: 58.40 MFlops
      mul-double: 59.33 MFlops
      - after:
      mul-single: 60.25 MFlops
      mul-double: 94.79 MFlops
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • hardfloat: implement float32/64 addition and subtraction · 1b615d48
      Emilio G. Cota committed
      Performance results (single and double precision) for fp-bench:
      
      1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
      - before:
      add-single: 135.07 MFlops
      add-double: 131.60 MFlops
      sub-single: 130.04 MFlops
      sub-double: 133.01 MFlops
      - after:
      add-single: 443.04 MFlops
      add-double: 301.95 MFlops
      sub-single: 411.36 MFlops
      sub-double: 293.15 MFlops
      
      2. ARM Aarch64 A57 @ 2.4GHz
      - before:
      add-single: 44.79 MFlops
      add-double: 49.20 MFlops
      sub-single: 44.55 MFlops
      sub-double: 49.06 MFlops
      - after:
      add-single: 93.28 MFlops
      add-double: 88.27 MFlops
      sub-single: 91.47 MFlops
      sub-double: 88.27 MFlops
      
      3. IBM POWER8E @ 2.1 GHz
      - before:
      add-single: 72.59 MFlops
      add-double: 72.27 MFlops
      sub-single: 75.33 MFlops
      sub-double: 70.54 MFlops
      - after:
      add-single: 112.95 MFlops
      add-double: 201.11 MFlops
      sub-single: 116.80 MFlops
      sub-double: 188.72 MFlops
      
      Note that the IBM and ARM machines benefit from having
      HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance
      can suffer significantly:
      - IBM Power8:
      add-single: [1] 54.94 vs [0] 116.37 MFlops
      add-double: [1] 58.92 vs [0] 201.44 MFlops
      - Aarch64 A57:
      add-single: [1] 80.72 vs [0] 93.24 MFlops
      add-double: [1] 82.10 vs [0] 88.18 MFlops
      
      On the Intel machine, having 2F64 set to 1 pays off, but it
      doesn't for 2F32:
      - Intel i7-6700K:
      add-single: [1] 285.79 vs [0] 426.70 MFlops
      add-double: [1] 302.15 vs [0] 278.82 MFlops
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • fpu: introduce hardfloat · a94b7839
      Emilio G. Cota committed
      This patch paves the way for leveraging the host FPU for a subset
      of guest FP operations. For most guest workloads (e.g. FP flags
      aren't ever cleared, inexact occurs often and rounding is set to the
      default [to nearest]) this will yield sizable performance speedups.
      
      The approach followed here avoids checking the FP exception flags register.
      See the added comment for details.
      
      This assumes that QEMU is running on an IEEE754-compliant FPU and
      that the rounding is set to the default (to nearest). The
      implementation-dependent specifics of the FPU should not matter; things
      like tininess detection and snan representation are still dealt with in
      soft-fp. However, this approach will break on most hosts if we compile
      QEMU with flags that break IEEE compatibility. There is no way to detect
      all of these flags at compilation time, but at least we check for
      -ffast-math (which defines __FAST_MATH__) and disable hardfloat
      (plus emit a #warning) when it is set.
      
      This patch just adds common code. Some operations will be migrated
      to hardfloat in subsequent patches to ease bisection.
      
      Note: some architectures (at least PPC, there might be others) clear
      the status flags passed to softfloat before most FP operations. This
      precludes the use of hardfloat, so to avoid introducing a performance
      regression for those targets, we add a flag to disable hardfloat.
      In the long run though it would be good to fix the targets so that
      at least the inexact flag passed to softfloat is indeed sticky.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
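
The fast-path idea described in the hardfloat commit message can be sketched roughly as follows. This is a minimal illustration, not QEMU's actual code: the `fp_status` struct, the flag and rounding constants, `SKETCH_USE_HOST_FPU`, `soft_add` and `sketch_add` are all hypothetical names, and the slow path is only a placeholder for a real soft-float routine. It assumes the host FPU is IEEE 754 compliant and left in its default round-to-nearest mode.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Compile-time guard in the spirit of the commit message: -ffast-math
 * defines __FAST_MATH__, and the host fast path is then disabled. */
#if defined(__FAST_MATH__)
# warning "IEEE compliance may be broken; disabling the host-FPU fast path"
# define SKETCH_USE_HOST_FPU 0
#else
# define SKETCH_USE_HOST_FPU 1
#endif

/* Hypothetical, simplified guest FP status (not QEMU's float_status). */
enum { FLAG_INEXACT = 1u << 0, FLAG_OVERFLOW = 1u << 1 };
enum { ROUND_NEAREST_EVEN = 0, ROUND_TO_ZERO = 1 };

typedef struct {
    unsigned flags;            /* sticky guest exception flags */
    int rounding_mode;         /* guest rounding mode */
    unsigned slow_path_calls;  /* bookkeeping for this example only */
} fp_status;

static bool is_zero_or_normal(double x)
{
    int c = fpclassify(x);
    return c == FP_ZERO || c == FP_NORMAL;
}

/* Placeholder for the soft-float slow path; a real implementation would
 * compute the result and update the flags entirely in software. */
static double soft_add(double a, double b, fp_status *s)
{
    s->slow_path_calls++;
    return a + b;
}

/* Use the host FPU only when doing so cannot change guest-visible state:
 * inexact is already set (and sticky), rounding is to nearest, and both
 * inputs are zero or normal. The host flags register is never read. */
double sketch_add(double a, double b, fp_status *s)
{
    bool fast = SKETCH_USE_HOST_FPU
                && (s->flags & FLAG_INEXACT)
                && s->rounding_mode == ROUND_NEAREST_EVEN
                && is_zero_or_normal(a) && is_zero_or_normal(b);
    if (!fast) {
        return soft_add(a, b, s);
    }

    double r = a + b;
    if (isinf(r)) {
        /* Finite inputs produced infinity: record the overflow. */
        s->flags |= FLAG_OVERFLOW;
    } else if (r != 0.0 && fabs(r) <= DBL_MIN) {
        /* Tiny result: let the slow path deal with tininess details. */
        return soft_add(a, b, s);
    }
    return r;
}
```

The sketch also shows why targets that clear the status flags before each operation never reach the fast path: with the inexact flag cleared, every call falls through to `soft_add`.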
    • tests/fp: add fp-bench · 25f539f3
      Emilio G. Cota committed
      These microbenchmarks will allow us to measure the performance impact of
      FP emulation optimizations. Note that we can measure either the direct
      impact on the softfloat functions (with "-t soft") or the impact on an
      emulated workload (call with "-t host" and run under qemu user-mode).
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • softfloat: add float{32,64}_is_zero_or_normal · 315df0d1
      Emilio G. Cota committed
      These will gain some users very soon.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
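
For readers unfamiliar with the classification, a "zero or normal" test on the raw IEEE 754 single-precision encoding might look like the sketch below. The function names are hypothetical and operate on a plain `uint32_t` bit pattern; they are not the helpers this commit adds, only an illustration of the same predicate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero: every bit except the sign is clear (+0.0 or -0.0). */
static inline bool f32_bits_is_zero(uint32_t a)
{
    return (a & 0x7fffffffu) == 0;
}

/* Normal: the 8-bit biased exponent is neither 0 (zero or denormal)
 * nor 0xff (infinity or NaN). */
static inline bool f32_bits_is_normal(uint32_t a)
{
    uint32_t exp = (a >> 23) & 0xffu;
    return exp != 0 && exp != 0xffu;
}

static inline bool f32_bits_is_zero_or_normal(uint32_t a)
{
    return f32_bits_is_normal(a) || f32_bits_is_zero(a);
}
```

Denormals, infinities and NaNs all fail the predicate, which makes it a natural filter for inputs that need special handling.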
    • softfloat: rename canonicalize to sf_canonicalize · f9943c7f
      Emilio G. Cota committed
      glibc >= 2.25 defines canonicalize in commit eaf5ad0
      (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26).
      
      Given that we'll be including <math.h> soon, prepare
      for this by prefixing our canonicalize() with sf_ to avoid
      clashing with the libc's canonicalize().
      Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
      Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • E
    • softfloat: add float{32,64}_is_{de,}normal · 588e6dfd
      Emilio G. Cota committed
      This paves the way for upcoming work.
      Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • fp-test: pick TARGET_ARM to get its specialization · 6c49b06d
      Emilio G. Cota committed
      This gets rid of the muladd errors due to not raising the invalid flag.
      
      - Before:
      Errors found in f64_mulAdd, rounding near_even, tininess before rounding:
      +000.0000000000000  +7FF.0000000000000  +7FF.FFFFFFFFFFFFF
              => +7FF.FFFFFFFFFFFFF .....  expected -7FF.FFFFFFFFFFFFF v....
      [...]
      
      - After:
      In 6133248 tests, no errors found in f64_mulAdd, rounding near_even, tininess before rounding.
      [...]
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Tested-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    • MAINTAINERS: update status of FPU emulation · 0636e4d8
      Alex Bennée committed
      Given I've spent a fair amount of time around this code now I'm
      putting myself forward as a maintainer. Also given that the code has
      been extensively re-written and has testing and new incoming features
      it is probably more than just Odd Fixes.
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    • contrib: add a basic gitdm config · 2f28271d
      Alex Bennée committed
      This is a QEMU specific version of a gitdm config for generating
      reports on the contributor base of the project. I've added enough
      group maps and domain aliases to ensure the current top ten is as
      reflective as it can be. As of this commit running:
      
        git log --numstat --since "Last Year" | gitdm -n -l 10
      
      Reports:
      
        Top changeset contributors by employer
        Red Hat                   3172 (44.3%)
        Linaro                    1153 (16.1%)
        (None)                     549 (7.7%)
        IBM                        348 (4.9%)
        Academics (various)        170 (2.4%)
        Virtuozzo                  168 (2.3%)
        Wave Computing             118 (1.6%)
        Xilinx                     102 (1.4%)
        Igalia                      93 (1.3%)
        Cadence Design Systems      88 (1.2%)
      
        Top lines changed by employer
        Red Hat                   144092 (28.1%)
        Cadence Design Systems    126554 (24.6%)
        Linaro                    77480 (15.1%)
        Wave Computing            33134 (6.5%)
        SiFive                    14392 (2.8%)
        IBM                       12219 (2.4%)
        (None)                    11948 (2.3%)
        Academics (various)       10447 (2.0%)
        Virtuozzo                 10445 (2.0%)
        CodeWeavers               9179 (1.8%)
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
    • Merge remote-tracking branch 'remotes/pmaydell/tags/pull-misc-20181214' into staging · b019f5e5
      Peter Maydell committed
      miscellaneous patches:
       * checkpatch.pl: Enforce multiline comment syntax
       * Rename cpu_physical_memory_write_rom() to address_space_write_rom()
       * disas, monitor, elf_ops: Use address_space_read() to read memory
       * Remove load_image() in favour of load_image_size()
       * Fix some minor memory leaks in arm boards/devices
       * virt: fix broken indentation
      
      # gpg: Signature made Fri 14 Dec 2018 14:41:20 GMT
      # gpg:                using RSA key 3C2525ED14360CDE
      # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
      # gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
      # gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
      # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE
      
      * remotes/pmaydell/tags/pull-misc-20181214: (22 commits)
        virt: Fix broken indentation
        target/arm: Create timers in realize, not init
        tests/test-arm-mptimer: Don't leak string memory
        hw/sd/sdhci: Don't leak memory region in sdhci_sysbus_realize()
        hw/arm/mps2-tz.c: Free mscname string in make_dma()
        target/arm: Free name string in ARMCPRegInfo hashtable entries
        include/hw/loader.h: Document load_image_size()
        hw/core/loader.c: Remove load_image()
        device_tree.c: Don't use load_image()
        hw/block/tc58128.c: Don't use load_image()
        hw/i386/multiboot.c: Don't use load_image()
        hw/i386/pc.c: Don't use load_image()
        hw/pci/pci.c: Don't use load_image()
        hw/smbios/smbios.c: Don't use load_image()
        hw/ppc/ppc405_boards: Don't use load_image()
        hw/ppc/mac_newworld, mac_oldworld: Don't use load_image()
        elf_ops.h: Use address_space_write() to write memory
        monitor: Use address_space_read() to read memory
        disas.c: Use address_space_read() to read memory
        Rename cpu_physical_memory_write_rom() to address_space_write_rom()
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  2. 16 Dec, 2018 3 commits
    • Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging · 58b1f0f2
      Peter Maydell committed
      Block layer patches:
      
      - qcow2: Decompression worker threads
      - dmg: lzfse compression support
      - file-posix: Simplify delegation to worker thread
      - Don't pass flags to bdrv_reopen_queue()
      - iotests: make 235 work on s390 (and others)
      
      # gpg: Signature made Fri 14 Dec 2018 10:55:09 GMT
      # gpg:                using RSA key 7F09B272C88F2FD6
      # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
      # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6
      
      * remotes/kevin/tags/for-upstream: (42 commits)
        block/mirror: add missing coroutine_fn annotations
        iotests: make 235 work on s390 (and others)
        block: Assert that flags are up-to-date in bdrv_reopen_prepare()
        block: Remove assertions from update_flags_from_options()
        block: Stop passing flags to bdrv_reopen_queue_child()
        block: Remove flags parameter from bdrv_reopen_queue()
        block: Clean up reopen_backing_file() in block/replication.c
        qemu-io: Put flag changes in the options QDict in reopen_f()
        block: Drop bdrv_reopen()
        block: Use bdrv_reopen_set_read_only() in the mirror driver
        block: Use bdrv_reopen_set_read_only() in external_snapshot_commit()
        block: Use bdrv_reopen_set_read_only() in qmp_change_backing_file()
        block: Use bdrv_reopen_set_read_only() in stream_start/complete()
        block: Use bdrv_reopen_set_read_only() in bdrv_commit()
        block: Use bdrv_reopen_set_read_only() in commit_start/complete()
        block: Use bdrv_reopen_set_read_only() in bdrv_backing_update_filename()
        block: Add bdrv_reopen_set_read_only()
        file-posix: Avoid aio_worker() for QEMU_AIO_IOCTL
        file-posix: Switch to .bdrv_co_ioctl
        file-posix: Remove paio_submit_co()
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • Merge remote-tracking branch 'remotes/kraxel/tags/usb-20181214-pull-request' into staging · 3866e6be
      Peter Maydell committed
      usb: fixes for mtp, ehci, usb-host and pvusb (xen).
      
      # gpg: Signature made Fri 14 Dec 2018 10:38:33 GMT
      # gpg:                using RSA key 4CB6D8EED3E87138
      # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
      # gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
      # gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
      # Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138
      
      * remotes/kraxel/tags/usb-20181214-pull-request:
        usb-mtp: Limit filename to object information size
        usb-mtp: use O_NOFOLLOW and O_CLOEXEC.
        ehci: fix fetch qtd race
        usb-host: reset and close libusb_device_handle before qemu exit
        pvusb: set max grants only in initialise
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-12-13-v2' into staging · 81781be3
      Peter Maydell committed
      QAPI patches for 2018-12-13
      
      # gpg: Signature made Fri 14 Dec 2018 05:53:51 GMT
      # gpg:                using RSA key 3870B400EB918653
      # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
      # gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
      # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653
      
      * remotes/armbru/tags/pull-qapi-2018-12-13-v2: (32 commits)
        qapi: add conditions to REPLICATION type/commands on the schema
        qapi: add more conditions to SPICE
        qapi: add condition to variants documentation
        qapi: add 'If:' condition to struct members documentation
        qapi: add 'If:' condition to enum values documentation
        qapi: Add #if conditions to generated code members
        qapi: add 'if' to alternate members
        qapi: add 'if' to union members
        qapi: Add 'if' to implicit struct members
        qapi: add a dictionary form for TYPE
        qapi-events: add 'if' condition to implicit event enum
        qapi: add 'if' to enum members
        qapi: add a dictionary form with 'name' key for enum members
        qapi: improve reporting of unknown or missing keys
        qapi: factor out checking for keys
        tests: print enum type members more like object type members
        qapi: change enum visitor and gen_enum* to take QAPISchemaMember
        qapi: Do not define enumeration value explicitly
        qapi: break long lines at 'data' member
        qapi: rename QAPISchemaEnumType.values to .members
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  3. 15 Dec, 2018 2 commits
    • Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging · d058a37a
      Peter Maydell committed
      Most notable change in this PR is the full removal of the "handle" fsdev
      backend.
      
      # gpg: Signature made Wed 12 Dec 2018 13:20:42 GMT
      # gpg:                using RSA key 71D4D5E5822F73D6
      # gpg: Good signature from "Greg Kurz <groug@kaod.org>"
      # gpg:                 aka "Gregory Kurz <gregory.kurz@free.fr>"
      # gpg:                 aka "[jpeg image of size 3330]"
      # Primary key fingerprint: B482 8BAF 9431 40CE F2A3  4910 71D4 D5E5 822F 73D6
      
      * remotes/gkurz/tags/for-upstream:
        9p: remove support for the "handle" backend
        xen/9pfs: use g_new(T, n) instead of g_malloc(sizeof(T) * n)
        9p: use g_new(T, n) instead of g_malloc(sizeof(T) * n)
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20181213' into staging · 110b1a8c
      Peter Maydell committed
      target-arm queue:
       * Convert various devices from sysbus init to instance_init
       * Remove the now unused sysbus init support entirely
       * Allow AArch64 processors to boot from a kernel placed over 4GB
       * hw: arm: musicpal: drop TYPE_WM8750 in object_property_set_link()
       * versal: minor fixes to virtio-mmio instantiation
       * arm: Implement the ARMv8.1-HPD extension
       * arm: Implement the ARMv8.2-AA32HPD extension
       * arm: Implement the ARMv8.1-LOR extension (as the trivial
         "no limited ordering regions provided" minimum)
      
      # gpg: Signature made Thu 13 Dec 2018 14:52:25 GMT
      # gpg:                using RSA key 3C2525ED14360CDE
      # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
      # gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
      # gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
      # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE
      
      * remotes/pmaydell/tags/pull-target-arm-20181213: (37 commits)
        target/arm: Implement the ARMv8.1-LOR extension
        target/arm: Use arm_hcr_el2_eff more places
        target/arm: Introduce arm_hcr_el2_eff
        target/arm: Implement the ARMv8.2-AA32HPD extension
        target/arm: Implement the ARMv8.1-HPD extension
        target/arm: Tidy scr_write
        target/arm: Fix HCR_EL2.TGE check in arm_phys_excp_target_el
        target/arm: Add SCR_EL3 bits up to ARMv8.5
        target/arm: Add HCR_EL2 bits up to ARMv8.5
        target/arm: Move id_aa64mmfr* to ARMISARegisters
        hw/arm: versal: Correct the nr of IRQs to 192
        hw/arm: versal: Use IRQs 111 - 118 for virtio-mmio
        hw/arm: versal: Reduce number of virtio-mmio instances
        hw/arm: versal: Remove bogus virtio-mmio creation
        core/sysbus: remove the SysBusDeviceClass::init path
        xen_backend: remove xen_sysdev_init() function
        usb/tusb6010: Convert sysbus init function to realize function
        timer/puv3_ost: Convert sysbus init function to realize function
        timer/grlib_gptimer: Convert sysbus init function to realize function
        timer/etraxfs_timer: Convert sysbus init function to realize function
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  4. 14 Dec, 2018 21 commits