1. 19 Mar, 2017 1 commit
    • qemu-ga: obey LISTEN_PID when using systemd socket activation · 53fabd4b
      Committed by Paolo Bonzini
      qemu-ga's socket activation support was not obeying the LISTEN_PID
      environment variable, which prevents a process from using a
      socket-activation file descriptor meant for its parent.
      
      For example, things can go wrong if a process forks a child before
      consuming the socket-activation file descriptor, and therefore before
      setting O_CLOEXEC on it.
      
      Luckily, qemu-nbd also got socket activation code, and its copy does
      support LISTEN_PID.  Some extra fixups are needed to ensure that the
      code can be used for both, but that's what this patch does.  The
      main change is to replace get_listen_fds's "consume" argument with
      the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code.
      
      Cc: "Richard W.M. Jones" <rjones@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      53fabd4b
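
      The LISTEN_PID check described above can be sketched as follows. This
      is a minimal illustration, not the code from the patch: the helper
      names parse_ulong and socket_activation_fd_count are hypothetical, and
      only FIRST_SOCKET_ACTIVATION_FD is named in the commit message (its
      value of 3 is an assumption based on systemd's convention of passing
      sockets starting at fd 3).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed value: systemd passes inherited sockets starting at fd 3. */
#define FIRST_SOCKET_ACTIVATION_FD 3

/* Parse a decimal number; returns 0 on success, -1 on error. */
static int parse_ulong(const char *s, unsigned long *out)
{
    char *end;

    if (s == NULL || *s == '\0') {
        return -1;
    }
    errno = 0;
    *out = strtoul(s, &end, 10);
    return (errno != 0 || *end != '\0') ? -1 : 0;
}

/*
 * Hypothetical helper (not QEMU's actual function): return how many
 * socket-activation fds were passed to *this* process, or 0 if none.
 */
static unsigned int socket_activation_fd_count(void)
{
    unsigned long pid, nr_fds, i;

    /* Only trust LISTEN_FDS if LISTEN_PID names this very process. */
    if (parse_ulong(getenv("LISTEN_PID"), &pid) < 0 ||
        pid != (unsigned long)getpid()) {
        return 0;
    }
    if (parse_ulong(getenv("LISTEN_FDS"), &nr_fds) < 0 ||
        nr_fds == 0 || nr_fds > INT_MAX - FIRST_SOCKET_ACTIVATION_FD) {
        return 0;
    }

    /* Consume the variables so that children do not act on them. */
    unsetenv("LISTEN_PID");
    unsetenv("LISTEN_FDS");

    /* Set close-on-exec so the fds are not leaked across exec. */
    for (i = 0; i < nr_fds; i++) {
        int fd = FIRST_SOCKET_ACTIVATION_FD + (int)i;
        int flags = fcntl(fd, F_GETFD);

        if (flags != -1) {
            fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
        }
    }
    return (unsigned int)nr_fds;
}
```

      A child forked before this runs still inherits LISTEN_FDS, but because
      LISTEN_PID names the parent, the check makes the child ignore the
      descriptors.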
  2. 18 Mar, 2017 1 commit
  3. 17 Mar, 2017 1 commit
  4. 15 Mar, 2017 1 commit
  5. 14 Mar, 2017 6 commits
    • icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread · 6b8f0187
      Committed by Paolo Bonzini
      icount has become much slower after tcg_cpu_exec has stopped
      using the BQL.  There is also a latent bug that is masked by
      the slowness.
      
      The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
      timer now has to wake up the I/O thread and wait for it.  The rendezvous
      is mediated by the BQL QemuMutex:
      
      - handle_icount_deadline wakes up the I/O thread with BQL taken
      - the I/O thread wakes up and waits on the BQL
      - the VCPU thread releases the BQL a little later
      - the I/O thread raises an interrupt, which calls qemu_cpu_kick
      - the VCPU thread notices the interrupt, takes the BQL to
        process it and waits on it
      
      All this back and forth is extremely expensive, causing a 6 to 8-fold
      slowdown when icount is turned on.
      
      One may think that the issue is that the VCPU thread is too dependent
      on the BQL, but then the latent bug comes in.  I first tried removing
      the BQL completely from the x86 cpu_exec, only to see everything break.
      The only way to fix it (and make everything slow again) was to add a dummy
      BQL lock/unlock pair.
      
      This is because in -icount mode you really have to process the events
      before the CPU restarts executing the next instruction.  Therefore, this
      series moves the processing of QEMU_CLOCK_VIRTUAL timers straight into
      the vCPU thread when running in icount mode.
      
      The required changes include:
      
      - make the timer notification callback wake up TCG's single vCPU thread
        when run from another thread.  By using async_run_on_cpu, the callback
        can override all_cpu_threads_idle() when the CPU is halted.
      
      - move handle_icount_deadline after qemu_tcg_wait_io_event, so that
        the timer notification callback is invoked after the dummy work item
        wakes up the vCPU thread
      
      - make handle_icount_deadline run the timers instead of just waking the
        I/O thread.
      
      - stop processing the timers in the main loop
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6b8f0187
    • cpus: define QEMUTimerListNotifyCB for QEMU system emulation · 3f53bc61
      Committed by Paolo Bonzini
      There is no change for now, because the callback just invokes
      qemu_notify_event.
      Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3f53bc61
    • qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h · d2528bdc
      Committed by Paolo Bonzini
      This dependency goes the wrong way, and we will need util/qemu-timer.h
      from sysemu/cpus.h in the next patch.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d2528bdc
    • qemu-timer: fix off-by-one · 33bef0b9
      Committed by Paolo Bonzini
      If the first timer is exactly at the current value of the clock, the
      deadline is met and the timer should fire.  This fixes itself on the next
      iteration of the loop without icount; with icount, however, execution
      of instructions will stop exactly at the deadline and won't proceed.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      33bef0b9
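
      The boundary condition in this fix can be illustrated with a simplified
      comparison. This is a sketch of the idea only, not the actual diff; the
      function name timer_deadline_met and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified check: has a timer with the given expiry
 * time reached its deadline at the given clock value?
 */
static bool timer_deadline_met(int64_t expire_time, int64_t current_time)
{
    /*
     * A timer that expires exactly "now" has met its deadline, so the
     * comparison must be <= rather than <.  With a strict comparison,
     * icount would stop execution at the deadline and never fire the timer.
     */
    return expire_time <= current_time;
}
```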
    • util: Removed unneeded header from path.c · bd5d983f
      Committed by Suramya Shah
      Signed-off-by: Suramya Shah <shah.suramya@gmail.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20170310163948.7567-1-shah.suramya@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bd5d983f
    • mem-prealloc: reduce large guest start-up and migration time. · 1e356fc1
      Committed by Jitendra Kolhe
      Using the "-mem-prealloc" option for a large guest leads to higher guest
      start-up and migration times. This is because with "-mem-prealloc",
      qemu tries to map every guest page (create address translations) and
      make sure the pages are available during runtime. virsh/libvirt seems
      to use "-mem-prealloc" by default when the guest is configured to use
      huge pages. The patch maps all guest pages in parallel by spawning
      multiple threads. The change is currently limited to QEMU library
      functions on POSIX-compliant hosts, as we are not sure whether the
      problem exists on win32. Below are some stats with the "-mem-prealloc"
      option for a guest configured to use huge pages.
      
      ------------------------------------------------------------------------
      Idle Guest      | Start-up time | Migration time
      ------------------------------------------------------------------------
      Guest stats with 2M HugePage usage - single threaded (existing code)
      ------------------------------------------------------------------------
      64 Core - 4TB   | 54m11.796s    | 75m43.843s
      64 Core - 1TB   | 8m56.576s     | 14m29.049s
      64 Core - 256GB | 2m11.245s     | 3m26.598s
      ------------------------------------------------------------------------
      Guest stats with 2M HugePage usage - map guest pages using 8 threads
      ------------------------------------------------------------------------
      64 Core - 4TB   | 5m1.027s      | 34m10.565s
      64 Core - 1TB   | 1m10.366s     | 8m28.188s
      64 Core - 256GB | 0m19.040s     | 2m10.148s
      -----------------------------------------------------------------------
      Guest stats with 2M HugePage usage - map guest pages using 16 threads
      -----------------------------------------------------------------------
      64 Core - 4TB   | 1m58.970s     | 31m43.400s
      64 Core - 1TB   | 0m39.885s     | 7m55.289s
      64 Core - 256GB | 0m11.960s     | 2m0.135s
      -----------------------------------------------------------------------
      
      Changed in v2:
       - modify number of memset threads spawned to min(smp_cpus, 16).
       - removed 64GB memory restriction for spawning memset threads.
      
      Changed in v3:
       - limit number of threads spawned based on
         min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus)
       - implement memset thread specific siglongjmp in SIGBUS signal_handler.
      
      Changed in v4:
       - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread
         as main thread no longer touches any pages.
       - simplify code by returning memset_thread_failed status from
         touch_all_pages.
      Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
      Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1e356fc1
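
      The approach described above (split the guest memory into chunks and
      touch one byte per page from several threads) can be sketched as
      follows. This is an illustration under assumptions, not the patch
      itself: touch_all_pages is named in the changelog, but its signature
      here, the helper pick_thread_count, and struct touch_args are
      hypothetical, and the SIGBUS handling mentioned in v3 is omitted.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_MEM_PREALLOC_THREADS 16

struct touch_args {
    char *addr;       /* start of this thread's chunk */
    size_t numpages;  /* pages in the chunk */
    size_t pagesize;
};

/* min(host CPUs, 16, guest CPUs); non-positive inputs are ignored. */
static int pick_thread_count(long host_cpus, int smp_cpus)
{
    long n = MAX_MEM_PREALLOC_THREADS;

    if (host_cpus > 0 && host_cpus < n) {
        n = host_cpus;
    }
    if (smp_cpus > 0 && smp_cpus < n) {
        n = smp_cpus;
    }
    return (int)n;
}

static void *touch_pages(void *opaque)
{
    struct touch_args *a = opaque;

    for (size_t i = 0; i < a->numpages; i++) {
        /* Write back the byte just read: forces the page to be mapped
         * without changing its contents. */
        volatile char *p = a->addr + i * a->pagesize;
        *p = *p;
    }
    return NULL;
}

/* Returns 0 on success, -1 if a thread could not be created. */
static int touch_all_pages(char *area, size_t numpages, size_t pagesize,
                           int nthreads)
{
    pthread_t tids[MAX_MEM_PREALLOC_THREADS];
    struct touch_args args[MAX_MEM_PREALLOC_THREADS];
    int started = 0, ret = 0;

    if (nthreads < 1) {
        nthreads = 1;
    } else if (nthreads > MAX_MEM_PREALLOC_THREADS) {
        nthreads = MAX_MEM_PREALLOC_THREADS;
    }

    size_t per_thread = numpages / (size_t)nthreads;
    size_t remainder = numpages % (size_t)nthreads;
    char *addr = area;

    for (int i = 0; i < nthreads; i++) {
        /* Spread the leftover pages over the first few threads. */
        size_t n = per_thread + ((size_t)i < remainder ? 1 : 0);

        args[i] = (struct touch_args){ addr, n, pagesize };
        if (pthread_create(&tids[i], NULL, touch_pages, &args[i]) != 0) {
            ret = -1;
            break;
        }
        started++;
        addr += n * pagesize;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return ret;
}
```

      A caller might pass pick_thread_count(sysconf(_SC_NPROCESSORS_ONLN),
      smp_cpus) as nthreads, matching the limit described in the v3 notes.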
  6. 07 Mar, 2017 3 commits
    • keyval: Support lists · 0b2c1bee
      Committed by Markus Armbruster
      Additionally permit non-negative integers as key components.  A
      dictionary's keys must either be all integers or none.  If all keys
      are integers, convert the dictionary to a list.  The set of keys must
      be [0,N].
      
      Examples:
      
      * list.1=goner,list.0=null,list.1=eins,list.2=zwei
        is equivalent to JSON [ "null", "eins", "zwei" ]
      
      * a.b.c=1,a.b.0=2
        is inconsistent: a.b.c clashes with a.b.0
      
      * list.0=null,list.2=eins,list.2=zwei
        has a hole: list.1 is missing
      
      Similar design flaw as for objects: there is no way to denote an empty
      list.  While interpreting "key absent" as empty list seems natural
      (removing a list member from the input string works when there are
      multiple ones, so why not when there's just one), it doesn't work:
      "key absent" already means "optional list absent", which isn't the
      same as "empty list present".
      
      Update the keyval object visitor to use this a.0 syntax in error
      messages rather than the usual a[0].
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <1488317230-26248-25-git-send-email-armbru@redhat.com>
      [Off-by-one fix squashed in, as per Kevin's review]
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      0b2c1bee
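
      The "no holes" rule from the examples above can be sketched as a check
      over the integer key components seen for one dictionary. This is a
      standalone illustration, not the keyval implementation; the function
      name list_length_or_hole and its interface are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the integer key components seen for one
 * dictionary (duplicates allowed, as in "list.1=goner,...,list.1=eins"),
 * return the resulting list length, or -1 if some index is missing.
 */
static long list_length_or_hole(const unsigned *keys, size_t nkeys)
{
    unsigned max = 0;

    if (nkeys == 0) {
        return 0;
    }
    for (size_t i = 0; i < nkeys; i++) {
        if (keys[i] > max) {
            max = keys[i];
        }
    }
    /* Every index from 0 up to the highest one must be present. */
    for (unsigned idx = 0; ; idx++) {
        bool found = false;

        for (size_t i = 0; i < nkeys; i++) {
            if (keys[i] == idx) {
                found = true;
                break;
            }
        }
        if (!found) {
            return -1;  /* hole, e.g. list.1 is missing */
        }
        if (idx == max) {
            break;
        }
    }
    return (long)max + 1;
}
```

      For the first example above the keys are 1, 0, 1, 2, which yields a
      list of three elements; for the third example the keys 0, 2, 2 leave
      index 1 missing, so the input is rejected.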
    • keyval: Restrict key components to valid QAPI names · f7400483
      Committed by Markus Armbruster
      Until now, key components are separated by '.'.  This leaves little
      room for evolving the syntax, and is incompatible with the __RFQDN_
      prefix convention for downstream extensions.
      
      Since key components will be commonly used as QAPI member names by the
      QObject input visitor, we can just as well borrow the QAPI naming
      rules here: letters, digits, hyphen and period starting with a letter,
      with an optional __RFQDN_ prefix for downstream extensions.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Message-Id: <1488317230-26248-20-git-send-email-armbru@redhat.com>
      f7400483
    • keyval: New keyval_parse() · d454dbe0
      Committed by Markus Armbruster
      keyval_parse() parses KEY=VALUE,... into a QDict.  Works like
      qemu_opts_parse(), except:
      
      * Returns a QDict instead of a QemuOpts (d'oh).
      
      * Supports nesting, unlike QemuOpts: a KEY is split into key
        fragments at '.' (dotted key convention; the block layer does
        something similar on top of QemuOpts).  The key fragments are QDict
        keys, and the last one's value is updated to VALUE.
      
      * Each key fragment may be up to 127 bytes long.  qemu_opts_parse()
        limits the entire key to 127 bytes.
      
      * Overlong key fragments are rejected.  qemu_opts_parse() silently
        truncates them.
      
      * Empty key fragments are rejected.  qemu_opts_parse() happily
        accepts empty keys.
      
      * It does not store the returned value.  qemu_opts_parse() stores it
        in the QemuOptsList.
      
      * It does not treat parameter "id" specially.  qemu_opts_parse()
        ignores all but the first "id", and fails when its value isn't
        id_wellformed(), or duplicate (a QemuOpts with the same ID is
        already stored).  It also screws up when a value contains ",id=".
      
      * Implied value is not supported.  qemu_opts_parse() desugars "foo" to
        "foo=on", and "nofoo" to "foo=off".
      
      * An implied key's value can't be empty, and can't contain ','.
      
      I intend to grow this into a saner replacement for QemuOpts.  It'll
      take time, though.
      
      Note: keyval_parse() provides no way to do lists, and its key syntax
      is incompatible with the __RFQDN_ prefix convention for downstream
      extensions, because it blindly splits at '.', even in __RFQDN_.  Both
      issues will be addressed later in the series.
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
      d454dbe0
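
      The key-splitting rules listed above (split at '.', reject empty
      fragments, reject fragments longer than 127 bytes) can be sketched in
      isolation. This is not keyval_parse() itself, which also builds the
      nested QDict; the function name count_key_fragments is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FRAGMENT_LEN 127

/*
 * Hypothetical helper: split KEY at '.' and return the number of key
 * fragments, or -1 if any fragment is empty or longer than 127 bytes.
 */
static int count_key_fragments(const char *key)
{
    const char *start = key;
    int n = 0;

    for (;;) {
        /* Length of the fragment up to the next '.' or end of string. */
        size_t len = strcspn(start, ".");

        if (len == 0 || len > MAX_FRAGMENT_LEN) {
            return -1;  /* rejected rather than silently truncated */
        }
        n++;
        if (start[len] == '\0') {
            return n;
        }
        start += len + 1;  /* skip the '.' */
    }
}
```

      With this sketch, "a.b.c" has three fragments, while "a..b", "a." and
      the empty string are all rejected because they contain an empty
      fragment.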
  7. 03 Mar, 2017 2 commits
  8. 01 Mar, 2017 2 commits
  9. 24 Feb, 2017 14 commits
  10. 21 Feb, 2017 9 commits