1. 29 Nov 2017, 6 commits
  2. 28 Nov 2017, 1 commit
  3. 27 Nov 2017, 4 commits
  4. 24 Nov 2017, 5 commits
  5. 23 Nov 2017, 4 commits
  6. 22 Nov 2017, 11 commits
    • migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END · acab30b8
      Daniel Henrique Barboza committed
      When migrating a VM with 'migrate_set_capability postcopy-ram on',
      a postcopy_state is set during the process, ending up in the
      POSTCOPY_INCOMING_END state when the migration is over. ram_load
      consults this postcopy_state to decide how it will load the memory
      pages. The same ram_load is also called by the loadvm command.
      
      Inside ram_load, the check that decides whether postcopy is
      considered running is:
      
      postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING
      
      postcopy_state_get() returns this enum type:
      
      typedef enum {
          POSTCOPY_INCOMING_NONE = 0,
          POSTCOPY_INCOMING_ADVISE,
          POSTCOPY_INCOMING_DISCARD,
          POSTCOPY_INCOMING_LISTENING,
          POSTCOPY_INCOMING_RUNNING,
          POSTCOPY_INCOMING_END
      } PostcopyState;
      
      In the case where ram_load is executed and postcopy_state is
      POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and
      ram_load will behave as if a postcopy were in progress. This
      scenario cannot happen during an actual migration, but it is
      reproducible by executing savevm/loadvm after migrating with
      'postcopy-ram on', causing loadvm to fail with Error -22:
      
      Source:
      
      (qemu) migrate_set_capability postcopy-ram on
      (qemu) migrate tcp:127.0.0.1:4444
      
      Dest:
      
      (qemu) migrate_set_capability postcopy-ram on
      (qemu)
      ubuntu1704-intel login:
      Ubuntu 17.04 ubuntu1704-intel ttyS0
      
      ubuntu1704-intel login: (qemu)
      (qemu) savevm test1
      (qemu) loadvm test1
      Unknown combination of migration flags: 0x4 (postcopy mode)
      error while loading state for instance 0x0 of device 'ram'
      Error -22 while loading VM state
      (qemu)
      
      This patch fixes this problem by changing the existing logic for
      postcopy_advised and postcopy_running in ram_load, making them
      'false' when we are in the POSTCOPY_INCOMING_END state.
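      
      A minimal sketch of the corrected checks, with both predicates
      excluding POSTCOPY_INCOMING_END (the helper names here are
      illustrative; the patch's exact refactoring may differ):
      
      static bool postcopy_is_advised(void)
      {
          PostcopyState ps = postcopy_state_get();
          return ps >= POSTCOPY_INCOMING_ADVISE && ps < POSTCOPY_INCOMING_END;
      }
      
      static bool postcopy_is_running(void)
      {
          PostcopyState ps = postcopy_state_get();
          return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
      }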
      Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      CC: Juan Quintela <quintela@redhat.com>
      CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      acab30b8
    • ppc: fix VTB migration · 6dd836f5
      Laurent Vivier committed
      Migration of a system under stress (for example, with
      "stress-ng --numa 2") triggers on the destination
      some kernel watchdog messages like:
      
      NMI watchdog: BUG: soft lockup - CPU#0 stuck for 3489660870s!
      NMI watchdog: BUG: soft lockup - CPU#1 stuck for 3489660884s!
      
      This problem appears with the changes introduced by
          42043e4f spapr: clock should count only if vm is running
      
      I believe that commit merely exposes the problem rather than
      causing it.
      
      The kernel computes the soft-lockup duration using the Virtual
      Timebase register (VTB), not the Timebase Register (TBR, the one
      that 42043e4f stops).
      
      It appears VTB is not migrated, so this patch adds it to the list
      of SPRs to migrate, which fixes the problem.
      
      For the migration, I've tested a migration from qemu-2.8.0 and
      pseries-2.8.0 to a patched master (qemu-2.11.0-rc1). The received
      VTB is 0 (as it is not initialized by qemu-2.8.0), but the value
      seems to be ignored by KVM and a non-zero VTB is used by the kernel.
      I have no explanation for that, but as the original problem appears
      only on SMP systems under stress, I suspect a problem in KVM
      (likely because VTB is shared by all threads of a core).
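      
      As a hedged sketch of the kind of change involved (the read
      callback and the registration site in target/ppc/translate_init.c
      are assumptions based on this message), making VTB a KVM-backed,
      migratable SPR looks roughly like:
      
      /* Register VTB so its value is transferred via the KVM one-reg
       * interface during migration. */
      spr_register_kvm(env, SPR_VTB, "VTB",
                       SPR_NOACCESS, SPR_NOACCESS,
                       &spr_read_tbl, SPR_NOACCESS,
                       KVM_REG_PPC_VTB, 0x00000000);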
      Signed-off-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      6dd836f5
    • spapr: Implement bug in spapr-vty device to be compatible with PowerVM · 6c3bc244
      David Gibson committed
      The spapr-vty device implements the PAPR defined virtual console,
      which is also implemented by IBM's proprietary PowerVM hypervisor.
      
      PowerVM's implementation has a bug where it inserts an extra \0
      after every \r going to the guest.  Because of that, Linux's
      guest-side driver has a workaround which strips \0 characters that
      appear immediately after a \r.
      
      That means that when running under qemu, sending a binary stream from
      host to guest via spapr-vty which happens to include a \r\0 sequence
      will get corrupted by that workaround.
      
      To deal with that, this patch duplicates PowerVM's bug, inserting an
      extra \0 after each \r.  Ugly, but the best option available.
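      
      A minimal sketch of the mangling described above, with buffer
      handling simplified for illustration (the real spapr-vty code path
      may differ):
      
      /* Copy 'len' bytes toward the guest, inserting '\0' after every
       * '\r' to mirror PowerVM; 'out' must hold up to 2 * len bytes. */
      static int vty_insert_crnul(uint8_t *out, const uint8_t *in, int len)
      {
          int n = 0;
          for (int i = 0; i < len; i++) {
              out[n++] = in[i];
              if (in[i] == '\r') {
                  out[n++] = '\0';
              }
          }
          return n;
      }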
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      6c3bc244
    • hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256 · bac658d1
      Thomas Huth committed
      LUNs >= 256 have to be encoded with the so-called "flat space
      addressing method" for virtio-scsi, where an additional bit has to
      be set. SLOF already took care of this with the following commit:
      
       https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da
       (see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details)
      
      But QEMU does not use this encoding yet for device tree paths
      that have to be handed over to SLOF to deal with the "bootindex"
      property, so SLOF currently fails to boot from virtio-scsi devices
      with LUNs >= 256 in the right boot order. Fix it by using the bit
      to indicate the "flat space addressing method" for LUNs >= 256.
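      
      As an illustrative sketch (the helper name is hypothetical), flat
      space addressing sets bit 14 of the 16-bit LUN field, i.e.
      addressing method 01b:
      
      /* LUNs below 256 keep the simple encoding; larger LUNs get the
       * "flat space" bit (0x4000) so SLOF decodes them correctly. */
      static unsigned int spapr_encode_lun(unsigned int lun)
      {
          return lun >= 256 ? ((lun & 0x3fff) | 0x4000) : lun;
      }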
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      bac658d1
    • migration, xen: Fix block image lock issue on live migration · 5d6c599f
      Anthony PERARD committed
      When doing a live migration of a Xen guest with libxl, the images
      for block devices are locked by the original QEMU process, which
      prevents the QEMU at the destination from taking the lock, so the
      migration fails.
      
      From QEMU's point of view, once the RAM of a domain is migrated,
      two QMP commands follow, "stop" then "xen-save-devices-state", at
      which point a new QEMU is spawned at the destination.
      
      Release the locks in "xen-save-devices-state" so the destination
      can take them, if it's a live migration.
      
      This patch adds a "live" parameter to "xen-save-devices-state",
      which defaults to true so that older versions of libxenlight keep
      working with newer versions of QEMU.
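      
      A hedged usage sketch over the QMP wire (the filename here is
      illustrative); during a live migration libxl would now issue
      roughly:
      
      { "execute": "stop" }
      { "execute": "xen-save-devices-state",
        "arguments": { "filename": "/tmp/xen-save", "live": true } }
      
      A non-live save can pass "live": false to keep the old behaviour
      of holding the image locks.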
      Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      5d6c599f
    • Update version for v2.11.0-rc2 release · a15d835f
      Peter Maydell committed
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      a15d835f
    • Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging · 64807cd7
      Peter Maydell committed
      # gpg: Signature made Tue 21 Nov 2017 17:01:33 GMT
      # gpg:                using RSA key 0xBDBE7B27C0DE3057
      # gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
      # gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
      # gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
      # Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057
      
      * remotes/cody/tags/block-pull-request:
        qemu-iotest: add test for blockjob coroutine race condition
        qemu-iotests: add option in common.qemu for mismatch only
        coroutine: abort if we try to schedule or enter a pending coroutine
        blockjob: do not allow coroutine double entry or entry-after-completion
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      64807cd7
    • qemu-iotest: add test for blockjob coroutine race condition · d975301d
      Jeff Cody committed
      d975301d
    • qemu-iotests: add option in common.qemu for mismatch only · a2339699
      Jeff Cody committed
      Add an option to echo the response to a QMP/HMP command only on
      mismatch.
      
      Useful for ignoring all normal responses while still catching
      things like segfaults.
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      a2339699
    • coroutine: abort if we try to schedule or enter a pending coroutine · 6133b39f
      Jeff Cody committed
      The previous patch fixed a race condition in which coroutines were
      being executed twice, or after coroutine deletion.
      
      We can detect common scenarios when this happens, and print an error
      message and abort before we corrupt memory / data, or segfault.
      
      This patch will abort if an attempt is made to enter a coroutine
      while it is currently pending execution, either in a specific
      AioContext bh or pending execution via a timer.  It will also
      abort if a coroutine is scheduled before a prior scheduled run
      has occurred.
      
      We cannot rely on the existing co->caller check for recursive re-entry
      to catch this, as the coroutine may run and exit with
      COROUTINE_TERMINATE before the scheduled coroutine executes.
      
      (This is the scenario that was occurring and fixed in the previous
      patch).
      
      This patch also re-orders the Coroutine struct elements in an attempt to
      optimize caching.
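      
      A sketch of the guard, assuming a new 'scheduled' field on
      Coroutine that records who scheduled it (the field and the message
      wording are assumptions based on this description):
      
      /* Claim the pending slot atomically; if it was already claimed,
       * a double schedule/enter is in flight, so abort loudly. */
      const char *scheduled = atomic_cmpxchg(&co->scheduled, NULL, __func__);
      if (scheduled) {
          fprintf(stderr, "%s: coroutine already scheduled in '%s'\n",
                  __func__, scheduled);
          abort();
      }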
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      6133b39f
    • blockjob: do not allow coroutine double entry or entry-after-completion · 4afeffc8
      Jeff Cody committed
      When block_job_sleep_ns() is called, the coroutine is scheduled
      for future execution.  If we allow the job to be re-entered prior
      to the scheduled time, we create a race condition in which a
      coroutine can be entered recursively, or even entered after the
      coroutine is deleted.
      
      The job->busy flag is used by blockjobs when a coroutine is busy
      executing. The function block_job_enter() obeys the busy flag,
      and will not enter a coroutine while it is set.  If we sleep a
      job, we need to leave the busy flag set, so that subsequent calls
      to block_job_enter() are prevented.
      
      This changes the prior behavior of block_job_cancel() being able to
      immediately wake up and cancel a job; in practice, this should not be an
      issue, as the coroutine sleep times are generally very small, and the
      cancel will occur the next time the coroutine wakes up.
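      
      A simplified sketch of the sleep behaviour described above
      (details of the real block_job_sleep_ns() are elided):
      
      /* Keep job->busy set across the sleep so block_job_enter()
       * refuses to re-enter the coroutine until the timer fires. */
      void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
      {
          assert(job->busy);
          if (!block_job_should_pause(job)) {
              co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
          }
          block_job_pause_point(job);
      }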
      
      This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      4afeffc8
  7. 21 Nov 2017, 9 commits