1. 10 April 2015, 1 commit
  2. 09 April 2015, 6 commits
  3. 08 April 2015, 9 commits
    • fix memleak in qemuRestoreCgroupState · 7cd0cf05
      Committed by Luyao Huang
       131,088 bytes in 16 blocks are definitely lost in loss record 2,174 of 2,176
          at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
          by 0x4C2BACB: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
          by 0x52A026F: virReallocN (viralloc.c:245)
          by 0x52BFCB5: saferead_lim (virfile.c:1268)
          by 0x52C00EF: virFileReadLimFD (virfile.c:1328)
          by 0x52C019A: virFileReadAll (virfile.c:1351)
          by 0x52A5D4F: virCgroupGetValueStr (vircgroup.c:763)
          by 0x1DDA0DA3: qemuRestoreCgroupState (qemu_cgroup.c:805)
          by 0x1DDA0DA3: qemuConnectCgroup (qemu_cgroup.c:857)
          by 0x1DDB7BA1: qemuProcessReconnect (qemu_process.c:3694)
          by 0x52FD171: virThreadHelper (virthread.c:206)
          by 0x82B8DF4: start_thread (pthread_create.c:308)
          by 0x85C31AC: clone (clone.S:113)
      Signed-off-by: Luyao Huang <lhuang@redhat.com>
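      The trace above points at a string read through the cgroup helpers and never
      freed on reconnect. As a hedged illustration only (not the literal patch), this
      is the usual shape of such a leak and its fix: the buffer handed back by
      virCgroupGetCpusetMems() must be released on every exit path. The surrounding
      function and the priv/cgroup naming below are assumptions.

        /* Hedged sketch of the leak pattern, not the actual commit.
         * virCgroupGetCpusetMems() allocates *mem_mask on success;
         * returning without freeing it produces the valgrind record above. */
        static int
        restoreCgroupStateSketch(virCgroupPtr cgroup)
        {
            char *mem_mask = NULL;
            int ret = -1;

            if (virCgroupGetCpusetMems(cgroup, &mem_mask) < 0)
                goto cleanup;

            /* ... push mem_mask down to the child cgroups ... */
            ret = 0;

         cleanup:
            VIR_FREE(mem_mask);   /* the missing free is what valgrind flagged */
            return ret;
        }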
    • qemuProcessHook: Call virNuma*() only when needed · ea576ee5
      Committed by Michal Privoznik
      https://bugzilla.redhat.com/show_bug.cgi?id=1198645
      
      Once upon a time, there was a little domain. And the domain was pinned
      onto a NUMA node and had not fully allocated its memory:
      
        <memory unit='KiB'>2355200</memory>
        <currentMemory unit='KiB'>1048576</currentMemory>
      
        <numatune>
          <memory mode='strict' nodeset='0'/>
        </numatune>
      
      Oh little me, said the domain, what will I do with so little memory?
      If only I had a few megabytes more. But the old admin noticed the
      whimpering, barely audible to the untrained human ear. And good admin
      that he was, he gave the domain yet more memory. But the old NUMA
      topology witch forbade allocating more memory on node zero. So he
      decided to allocate it on a different node:
      
      virsh # numatune little_domain --nodeset 0-1
      
      virsh # setmem little_domain 2355200
      
      The little domain was happy. For a while. Until a bad, sharp-toothed
      creature came. Every process in the system was afraid of him.
      The OOM Killer, they called him. Oh no, he's after the little domain.
      There's no escape.
      
      Do you kids know why? Because when the little domain was born, her
      father, Libvirt, called numa_set_membind(). So even though the admin
      allowed her to allocate memory from other nodes via cgroups, the
      membind() forbade it.
      
      So what's the lesson? Libvirt should rely on cgroups whenever possible
      and use numa_set_membind() only as a last-ditch effort (a sketch of
      that preference follows this entry).
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
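      A minimal sketch of that lesson, under stated assumptions: prefer writing the
      node restriction through cgroups, which numatune can relax later, and only fall
      back to libnuma's numa_set_membind(), which cannot be loosened for a running
      process. The helper name and the fallback condition are illustrative, not
      libvirt's actual code.

        #include <numa.h>               /* libnuma: numa_set_membind() etc. */
        #include "vircgroup.h"          /* libvirt: virCgroupSetCpusetMems() */

        /* Hypothetical helper: restrict memory allocation to @nodeset ("0-1"). */
        static int
        restrictMemoryNodesSketch(virCgroupPtr cgroup, const char *nodeset)
        {
            struct bitmask *mask;

            /* Preferred: cgroups, adjustable later via 'virsh numatune'. */
            if (cgroup && virCgroupSetCpusetMems(cgroup, nodeset) == 0)
                return 0;

            /* Last-ditch effort: hard per-process binding via libnuma. */
            if (numa_available() < 0 || !(mask = numa_parse_nodestring(nodeset)))
                return -1;
            numa_set_membind(mask);     /* cannot be relaxed afterwards */
            numa_free_nodemask(mask);
            return 0;
        }

      With that ordering, a later "virsh numatune little_domain --nodeset 0-1"
      followed by setmem can succeed, because only the cgroup limit stands in the way.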
    • qemu_driver: check caps after starting block job · cfcdf5ff
      Committed by Michael Chapman
      Currently we check qemuCaps before starting the block job. But qemuCaps
      isn't available on a stopped domain, which means we get a misleading
      error message in this case:
      
        # virsh domstate example
        shut off
      
        # virsh blockjob example vda
        error: unsupported configuration: block jobs not supported with this QEMU binary
      
      Move the qemuCaps check into the block job so that we are guaranteed the
      domain is running.
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
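      A hedged sketch of the reordering described above: once inside the job, check
      that the domain is active before consulting qemuCaps, so a shut-off domain gets
      "domain is not running" instead of the misleading capability error. The
      fragment and the particular capability flag chosen are illustrative.

        /* Inside qemuDomainBlockJobImpl-style code, after acquiring the job. */
        if (!virDomainObjIsActive(vm)) {
            virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                           _("domain is not running"));
            goto endjob;
        }

        /* priv->qemuCaps is only meaningful for a running domain. */
        if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC)) {
            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                           _("block jobs not supported with this QEMU binary"));
            goto endjob;
        }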
    • qemu_migrate: use nested job when adding NBD to cookie · 72df8314
      Committed by Michael Chapman
      qemuMigrationCookieAddNBD is usually called from within an async
      MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.
      
      (The one exception is during the Begin phase when change protection
      isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
      as qemuDomainObjEnterMonitor in this case.)
      
      This bug was encountered with a libvirt client that repeatedly queries
      the disk mirroring block job info during a migration. If one of these
      queries occurs just as the Perform migration cookie is baked, libvirt
      crashes.
      
      Relevant logs are as follows:
      
          6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
      [1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"}
      [2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"}
      [3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"}
      [4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
          6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}'
      
      At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
      on mon->notify. At [2] the request is written out to the monitor socket.
      At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
      mon->notify. The reply from the first request is received at [4].
      However, qemuMonitorJSONIOProcessLine is not expecting this reply since
      the second request hadn't completed sending. The reply is dropped and an
      error is returned.
      
      qemuMonitorIO signals mon->notify twice during its error handling,
      waking up both of the threads waiting on it. One of them clears mon->msg
      as it exits qemuMonitorSend; the other crashes:
      
        qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975
        975         while (!mon->msg->finished) {
        (gdb) print mon->msg
        $1 = (qemuMonitorMessagePtr) 0x0
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
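      The pattern the fix relies on, sketched and hedged (the function body is
      illustrative, not the patch): code running under an async MIGRATION_IN/OUT job
      must enter the monitor through qemuDomainObjEnterMonitorAsync() with the owning
      async job, so a nested job serialises it against concurrent monitor users such
      as the query-block-jobs call in the log above.

        /* Illustrative only: adding NBD data to the migration cookie. */
        static int
        cookieAddNBDSketch(virQEMUDriverPtr driver,
                           virDomainObjPtr vm,
                           qemuDomainAsyncJob asyncJob)
        {
            /* Starts a nested job when inside an async job; a plain
             * qemuDomainObjEnterMonitor() here lets two threads race on
             * mon->msg, which is the crash shown in the backtrace. */
            if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
                return -1;

            /* ... query-block / capacity updates via the monitor ... */

            if (qemuDomainObjExitMonitor(driver, vm) < 0)
                return -1;
            return 0;
        }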
    • qemu: fix race between disk mirror fail and cancel · e5d729ba
      Committed by Michael Chapman
      If a VM migration is aborted, a disk mirror may be failed by QEMU before
      libvirt has a chance to cancel it. The disk->mirrorState remains at
      _ABORT in this case, and this breaks subsequent mirrorings of that disk.
      
      We should instead check the mirrorState directly and transition to _NONE
      if it is already aborted. Do the check *after* aborting the block job in
      QEMU to avoid a race.
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
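      A hedged sketch of the described transition, using the mirror-state enum names
      from libvirt's domain_conf.h; the surrounding cancel path is heavily simplified.

        /* After the abort has been sent to QEMU (checking earlier reopens the
         * race: QEMU could still fail the mirror in between). */
        if (disk->mirrorState == VIR_DOMAIN_DISK_MIRROR_STATE_ABORT) {
            /* QEMU already failed/ended the mirror before we cancelled it;
             * reset the state so the disk can be mirrored again later. */
            disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE;
        }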
    • qemu: fix error propagation in qemuMigrationBegin · 77ddd0bb
      Committed by Michael Chapman
      If virCloseCallbacksSet fails, qemuMigrationBegin must return NULL to
      indicate an error occurred.
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
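      A minimal sketch of the propagation, assuming (as in qemuMigrationBegin) that
      the function returns the migration XML on success; the exact cleanup around the
      real call site may differ.

        /* xml has been prepared; registering the close callback may still fail. */
        if (virCloseCallbacksSet(driver->closeCallbacks, vm, conn,
                                 qemuMigrationCleanup) < 0)
            VIR_FREE(xml);        /* VIR_FREE() nulls xml, so NULL is returned */

        return xml;               /* NULL tells the caller that Begin failed */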
    • qemu: fix crash in qemuProcessAutoDestroy · 7578cc17
      Committed by Michael Chapman
      The destination libvirt daemon in a migration may segfault if the client
      disconnects immediately after the migration has begun:
      
        # virsh -c qemu+tls://remote/system list --all
         Id    Name                           State
        ----------------------------------------------------
        ...
      
        # timeout --signal KILL 1 \
            virsh migrate example qemu+tls://remote/system \
              --verbose --compressed --live --auto-converge \
              --abort-on-error --unsafe --persistent \
              --undefinesource --copy-storage-all --xml example.xml
        Killed
      
        # virsh -c qemu+tls://remote/system list --all
        error: failed to connect to the hypervisor
        error: unable to connect to server at 'remote:16514': Connection refused
      
      The crash is in:
      
         1531 void
         1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
         1533 {
         1534     qemuDomainObjPrivatePtr priv = obj->privateData;
         1535     qemuDomainJob job = priv->job.active;
         1536
         1537     priv->jobs_queued--;
      
      Backtrace:
      
        #0  at qemuDomainObjEndJob at qemu/qemu_domain.c:1537
        #1  in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497
        #2  in qemuProcessAutoDestroy at qemu/qemu_process.c:5646
        #3  in virCloseCallbacksRun at util/virclosecallbacks.c:350
        #4  in qemuConnectClose at qemu/qemu_driver.c:1154
        ...
      
      qemuDomainRemoveInactive calls virDomainObjListRemove, which in this
      case is holding the last remaining reference to the domain.
      qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain
      object has been freed and poisoned by then.
      
      This patch bumps the domain's refcount until qemuDomainRemoveInactive
      has completed. We also ensure qemuProcessAutoDestroy does not return the
      domain to virCloseCallbacksRun to be unlocked in this case. There is
      similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy
      (which call virDomainObjListRemove directly).
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
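      The refcount technique described above, as an isolated and hedged sketch (the
      real patch spreads this across qemuDomainRemoveInactive and
      qemuProcessAutoDestroy): hold an extra reference so ending the job still
      operates on a live object.

        /* Illustrative: removing an inactive domain while a job is still open. */
        virObjectRef(vm);                             /* keep @vm alive for us */

        virDomainObjListRemove(driver->domains, vm);  /* may drop the list's
                                                       * last reference to @vm */
        qemuDomainObjEndJob(driver, vm);              /* still safe: we hold a ref */

        virObjectUnref(vm);                           /* release our extra reference */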
    • virQEMUDriverGetConfig: Fix memleak · 225aa802
      Committed by Michal Privoznik
      ==19015== 968 (416 direct, 552 indirect) bytes in 1 blocks are definitely lost in loss record 999 of 1,049
      ==19015==    at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
      ==19015==    by 0x52ADF14: virAllocVar (viralloc.c:560)
      ==19015==    by 0x5302FD1: virObjectNew (virobject.c:193)
      ==19015==    by 0x1DD9401E: virQEMUDriverConfigNew (qemu_conf.c:164)
      ==19015==    by 0x1DDDF65D: qemuStateInitialize (qemu_driver.c:666)
      ==19015==    by 0x53E0823: virStateInitialize (libvirt.c:777)
      ==19015==    by 0x11E067: daemonRunStateInit (libvirtd.c:905)
      ==19015==    by 0x53201AD: virThreadHelper (virthread.c:206)
      ==19015==    by 0xA1EE1F2: start_thread (in /lib64/libpthread-2.19.so)
      ==19015==    by 0xA4EFC8C: clone (in /lib64/libc-2.19.so)
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
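      The loss record shows a driver config object (from virQEMUDriverConfigNew)
      whose last reference is never dropped. As a hedged illustration of the pairing
      discipline such leaks come down to (the call site fixed by the actual patch may
      differ): virQEMUDriverGetConfig() hands back a referenced object that every
      caller must unref.

        virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);

        /* ... read settings from cfg ... */

        virObjectUnref(cfg);    /* each GetConfig() needs a matching unref,
                                 * otherwise valgrind reports a loss record
                                 * like the one above */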
    • qemuSetupCgroupForVcpu: Fix memleak · 9dbe6f31
      Committed by Michal Privoznik
      ==19015== 1,064 (656 direct, 408 indirect) bytes in 2 blocks are definitely lost in loss record 1,002 of 1,049
      ==19015==    at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
      ==19015==    by 0x52AD74B: virAlloc (viralloc.c:144)
      ==19015==    by 0x52B47CA: virCgroupNew (vircgroup.c:1057)
      ==19015==    by 0x52B53E5: virCgroupNewVcpu (vircgroup.c:1451)
      ==19015==    by 0x1DD85A40: qemuSetupCgroupForVcpu (qemu_cgroup.c:1013)
      ==19015==    by 0x1DDA66EA: qemuProcessStart (qemu_process.c:4844)
      ==19015==    by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
      ==19015==    by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
      ==19015==    by 0x1DDF1ACD: qemuDomainCreate (qemu_driver.c:7337)
      ==19015==    by 0x53F87EA: virDomainCreate (libvirt-domain.c:6820)
      ==19015==    by 0x12690A: remoteDispatchDomainCreate (remote_dispatch.h:3481)
      ==19015==    by 0x126827: remoteDispatchDomainCreateHelper (remote_dispatch.h:3457)
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
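      Here the leaked blocks are the per-vcpu virCgroup handles created by
      virCgroupNewVcpu(). A hedged sketch of the allocate/use/free shape involved;
      the real patch frees the handle on the path that leaked it.

        /* Inside a function like qemuSetupCgroupForVcpu: */
        virCgroupPtr cgroup_vcpu = NULL;
        int ret = -1;

        if (virCgroupNewVcpu(priv->cgroup, vcpuid, true, &cgroup_vcpu) < 0)
            goto cleanup;

        /* ... apply period/quota and pinning to cgroup_vcpu ... */
        ret = 0;

     cleanup:
        virCgroupFree(&cgroup_vcpu);   /* frees the handle only, not the cgroup;
                                        * skipping it leaks the bytes shown above */
        return ret;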
  4. 02 April 2015, 17 commits
  5. 31 March 2015, 3 commits
    • qemu: blockjob: Synchronously update backing chain in XML on ABORT/PIVOT · 630ee5ac
      Committed by Peter Krempa
      When the synchronous pivot option is selected, libvirt would not update
      the backing chain until the job had exited. Some applications then
      received invalid data because their own job serialized first.
      
      This patch removes the polling wait for ABORT/PIVOT job completion and
      replaces it with a condition. If a synchronous operation is requested,
      the XML update is executed in the job of the caller of the synchronous
      request. Otherwise the monitor event callback uses a separate worker to
      update the backing chain with a new job (a condition-variable sketch
      follows this entry).
      
      This is a regression since 1a92c719
      
      When the ABORT job is finished synchronously you get the following call
      stack:
       #0  qemuBlockJobEventProcess
       #1  qemuDomainBlockJobImpl
       #2  qemuDomainBlockJobAbort
       #3  virDomainBlockJobAbort
      
      While previously or while using the _ASYNC flag you'd get:
       #0  qemuBlockJobEventProcess
       #1  processBlockJobEvent
       #2  qemuProcessEventHandler
       #3  virThreadPoolWorker
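      A hedged sketch of trading the poll loop for a condition, using libvirt's
      virMutex/virCond primitives; the per-disk state structure and field names below
      are hypothetical, only the threading API is real, and the state is assumed to
      have been initialised with virMutexInit()/virCondInit().

        #include <stdbool.h>
        #include "virthread.h"   /* virMutex, virCond, virCondWait, virCondSignal */

        /* Hypothetical per-disk synchronisation state. */
        typedef struct {
            virMutex lock;
            virCond cond;
            bool finished;
        } blockJobSyncState;

        /* Synchronous ABORT/PIVOT caller: block on the condition instead of
         * polling the job status in a sleep loop. */
        static void
        waitForBlockJobSketch(blockJobSyncState *st)
        {
            virMutexLock(&st->lock);
            while (!st->finished)
                if (virCondWait(&st->cond, &st->lock) < 0)
                    break;
            virMutexUnlock(&st->lock);
        }

        /* Monitor event callback: the job finished, update the backing chain
         * (in the caller's job for the sync case) and wake the waiter. */
        static void
        signalBlockJobDoneSketch(blockJobSyncState *st)
        {
            virMutexLock(&st->lock);
            st->finished = true;
            virCondSignal(&st->cond);
            virMutexUnlock(&st->lock);
        }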
    • qemu: Extract internals of processBlockJobEvent into a helper · 0c4474df
      Committed by Peter Krempa
      Later on I'll be adding a condition that will allow synchronising a
      SYNC block job abort. The approach will require this code to be called
      from two different places, so it has to be extracted into a helper.
    • qemu: processBlockJob: Don't unlock @vm twice · 6b6c4ab8
      Committed by Peter Krempa
      Commit 1a92c719 moved the code that handles block job events to a
      different function that is executed in a separate thread. The caller of
      processBlockJob handles locking and unlocking of @vm, so we should not
      do it in the function itself.
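      A tiny hedged illustration of that lock-ownership rule;
      handleBlockJobEventSketch stands in for the real worker and is hypothetical.

        /* The dispatching caller owns the lock on @vm: */
        virObjectLock(vm);
        handleBlockJobEventSketch(driver, vm);  /* hypothetical worker; must not
                                                 * call virObjectUnlock(vm) itself */
        virObjectUnlock(vm);                    /* exactly one unlock, by the owner */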
  6. 30 March 2015, 2 commits
  7. 27 March 2015, 1 commit
  8. 26 March 2015, 1 commit