    tests: introduce a framework for testing migration performance · 409437e1
    Committed by Daniel P. Berrange
    This introduces a moderately general purpose framework for
    testing performance of migration.
    
    The initial guest workload is provided by the included 'stress'
    program, which is configured to spawn one thread per guest CPU
    and run a maximally memory intensive workload. It will loop
    over GB of memory, xor'ing each byte with data from a 4k array
    of random bytes. This ensures heavy read and write load across
    all of guest memory to stress the migration performance. While
    running, the 'stress' program will record how long it takes to
    xor each GB of memory and print this data for later reporting.
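
    A rough sketch of the per-thread workload logic (the real
    implementation is a C program; the names below are invented
    for illustration):

      import time

      PAGE = 4096
      GB = 1024 * 1024 * 1024

      def xor_one_gb(mem, random_page, offset):
          # XOR 1 GB of memory against the 4k random array, timing
          # the pass so per-GB throughput can be reported later
          start = time.monotonic()
          for page in range(offset, offset + GB, PAGE):
              for i in range(PAGE):
                  mem[page + i] ^= random_page[i]
          return time.monotonic() - start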
    
    The test engine will spawn a pair of QEMU processes, either on
    the same host, or with the target on a remote host via ssh,
    using the host kernel and a custom initrd built with 'stress'
    as the /init binary. Kernel command line args are set to ensure
    a fast kernel boot time (< 1 second) between launching QEMU and
    the stress program starting execution.
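
    For illustration only, the launch amounts to something like the
    following (these arguments are an assumption for the sketch, not
    a copy of the framework's code):

      import os
      import subprocess

      qemu_args = [
          "qemu-system-x86_64",
          "-machine", "accel=kvm",
          "-m", "1024",
          "-kernel", "/boot/vmlinuz-%s" % os.uname().release,
          "-initrd", "tests/migration/initrd-stress.img",
          # a quiet, minimal command line keeps boot under ~1 second
          "-append", "noapic edd=off printk.time=1 console=ttyS0",
          "-nographic",
      ]
      src = subprocess.Popen(qemu_args)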
    
    Nonetheless, the test engine will initially wait N seconds for
    the guest workload to stabilize, before starting the migration
    operation. While migration is running, the engine can apply the
    pause, post-copy, autoconverge, xbzrle compression and multithread
    compression features, as well as downtime & bandwidth tuning, to
    encourage completion. If migration completes, the test engine
    will wait N seconds again for the guest workload to stabilize on
    the target host. If migration does not complete after a preset
    number of iterations, it will be aborted.
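
    A hedged sketch of that control flow (the qmp() helper and the
    field names are illustrative, not the framework's actual code):

      import time

      def run_migration(src, dst, pause_secs, max_iters):
          time.sleep(pause_secs)             # let workload stabilize
          src.qmp("migrate", uri=dst.migrate_uri)
          while True:
              info = src.qmp("query-migrate")
              if info["status"] == "completed":
                  break
              # dirty-sync-count reports the migration iteration
              if info["ram"]["dirty-sync-count"] > max_iters:
                  src.qmp("migrate_cancel")  # give up: not converging
                  return False
          time.sleep(pause_secs)             # stabilize on the target
          return True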
    
    While the QEMU process is running on the source host, the test
    engine will sample the host CPU usage of QEMU as a whole, and
    each vCPU thread. While migration is running, it will record
    all the stats reported by 'query-migrate'. Finally, it will
    capture the output of the stress program running in the guest.
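
    Sampling host CPU usage can be derived from /proc; a minimal
    sketch of the idea (not the framework's exact code):

      def cpu_ticks(pid):
          # utime + stime are fields 14 & 15 of /proc/<pid>/stat,
          # in units of clock ticks (USER_HZ); sampling this
          # periodically and diffing gives CPU usage over time
          with open("/proc/%d/stat" % pid) as f:
              fields = f.read().rsplit(")", 1)[1].split()
          return int(fields[11]) + int(fields[12])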
    
    All the data produced from a single test execution is recorded
    in a structured JSON file. A separate program is then able to
    create interactive charts using the "plotly" python + javascript
    libraries, showing the characteristics of the migration.
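
    A minimal sketch of that plotting step (the JSON field names
    here are assumptions, not the report's actual schema):

      import json
      import plotly.graph_objs as go
      from plotly.offline import plot

      with open("result.json") as f:
          report = json.load(f)

      samples = report["guest_samples"]    # hypothetical field name
      trace = go.Scatter(x=[s["timestamp"] for s in samples],
                         y=[s["value"] for s in samples],
                         name="guest workload")
      plot([trace], filename="result.html", auto_open=False)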
    
    The data output provides visualization of the effect on guest
    vCPU workloads from the migration process, the corresponding
    vCPU utilization on the host, and the overall CPU hit from
    QEMU on the host. This is correlated with statistics from the
    migration process, such as downtime, vCPU throttling and iteration
    number.
    
    While the tests can be run individually with arbitrary parameters,
    there is also a facility for producing batch reports for a number
    of pre-defined scenarios / comparisons, in order to be able to
    get standardized results across different hardware configurations
    (eg TCP vs RDMA, or comparing different VCPU counts / memory
    sizes, etc).
    
    To use this, first you must build the initrd image
    
     $ make tests/migration/initrd-stress.img
    
    To run a one-shot test with all default parameters
    
     $ ./tests/migration/guestperf.py > result.json
    
    The script has many command line arguments for varying its
    behaviour. For example, to increase the RAM size and CPU count
    and bind the guest to specific host NUMA nodes
    
     $ ./tests/migration/guestperf.py \
           --mem 4 --cpus 2 \
           --src-mem-bind 0 --src-cpu-bind 0,1 \
           --dst-mem-bind 1 --dst-cpu-bind 2,3 \
           > result.json
    
    Using mem + cpu binding is strongly recommended on NUMA
    machines, otherwise the guest performance results will
    vary wildly between runs of the test due to lucky/unlucky
    NUMA placement, making sensible data analysis impossible.
    
    To make it run across separate hosts:
    
     $ ./tests/migration/guestperf.py \
           --dst-host somehostname > result.json
    
    To request that post-copy is enabled, with switchover
    after 5 iterations
    
     $ ./tests/migration/guestperf.py \
           --post-copy --post-copy-iters 5 > result.json
    
    Once a result.json file is created, a graph of the data
    can be generated, showing guest workload performance per
    thread and the migration iteration points:
    
     $ ./tests/migration/guestperf-plot.py --output result.html \
            --migration-iters --split-guest-cpu result.json
    
    To further include host vCPU utilization and overall QEMU
    utilization
    
     $ ./tests/migration/guestperf-plot.py --output result.html \
            --migration-iters --split-guest-cpu \
            --qemu-cpu --vcpu-cpu result.json
    
    NB, the 'guestperf-plot.py' command requires that you have
    the plotly python library installed. eg you must do
    
     $ pip install --user plotly
    
    Viewing the result.html file requires that you have the
    plotly.min.js file in the same directory as the HTML
    output. This js file is installed as part of the plotly
    python library, so can be found in
    
      $HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
    
    The guestperf-plot.py program can accept multiple json files
    to plot, enabling results from different configurations to
    be compared.
    
    Finally, to run the entire standardized set of comparisons
    
      $ ./tests/migration/guestperf-batch.py \
           --dst-host somehost \
           --mem 4 --cpus 2 \
           --src-mem-bind 0 --src-cpu-bind 0,1 \
         --dst-mem-bind 1 --dst-cpu-bind 2,3 \
           --output tcp-somehost-4gb-2cpu
    
    will store JSON files from all scenarios in the directory
    named tcp-somehost-4gb-2cpu.

    Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
    Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
    Signed-off-by: Amit Shah <amit.shah@redhat.com>