1. 15 Apr, 2015 1 commit
  2. 14 Apr, 2015 1 commit
  3. 08 Apr, 2015 2 commits
    • M
      qemuProcessHook: Call virNuma*() only when needed · ea576ee5
      Michal Privoznik authored
      https://bugzilla.redhat.com/show_bug.cgi?id=1198645
      
      Once upon a time, there was a little domain. And the domain was pinned
      onto a NUMA node and had not fully allocated its memory:
      
        <memory unit='KiB'>2355200</memory>
        <currentMemory unit='KiB'>1048576</currentMemory>
      
        <numatune>
          <memory mode='strict' nodeset='0'/>
        </numatune>
      
      Oh little me, said the domain, what will I do with so little memory.
      If I only had a few megabytes more. But the old admin noticed the
      whimpering, barely audible to the untrained human ear. And good admin
      he was, he gave the domain yet more memory. But the old NUMA topology
      witch forbade allocating more memory on node zero. So he decided to
      allocate it on a different node:
      
      virsh # numatune little_domain --nodeset 0-1
      
      virsh # setmem little_domain 2355200
      
      The little domain was happy. For a while. Until bad, sharp teeth
      shaped creature came. Every process in the system was afraid of him.
      The OOM Killer they called him. Oh no, he's after the little domain.
      There's no escape.
      
      Do you kids know why? Because when the little domain was born, her
      father, Libvirt, called numa_set_membind(). So even if the admin
      allowed her to allocate memory from other nodes in the cgroups, the
      membind() forbade it.
      
      So what's the lesson? Libvirt should rely on cgroups whenever
      possible and use numa_set_membind() only as a last-ditch effort.
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
      ea576ee5
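      To illustrate the lesson above, here is a minimal, hedged sketch of the
      two mechanisms being contrasted: libnuma's numa_set_membind() hard-binds
      all future allocations and cannot be relaxed later, whereas a cpuset
      cgroup can be widened at runtime (e.g. via numatune). The cgroup path
      below is purely illustrative.
      
        #include <stdio.h>
        #include <numa.h>   /* libnuma: link with -lnuma */
        
        /* Hard-bind every future allocation of this process to node 0.
         * The kernel enforces this mask even if cpuset.mems is later
         * widened, which is what starved the little domain above. */
        static void bind_with_libnuma(void)
        {
            struct bitmask *mask = numa_allocate_nodemask();
            numa_bitmask_setbit(mask, 0);
            numa_set_membind(mask);
            numa_bitmask_free(mask);
        }
        
        /* Restrict allocations via the cpuset cgroup instead; this can be
         * relaxed at runtime ("virsh numatune ... --nodeset 0-1").
         * The cgroup path is an assumption for illustration only. */
        static void bind_with_cgroup(void)
        {
            FILE *f = fopen("/sys/fs/cgroup/cpuset/machine/little_domain/cpuset.mems", "w");
            if (!f)
                return;
            fprintf(f, "0");
            fclose(f);
        }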
    • M
      qemu: fix crash in qemuProcessAutoDestroy · 7578cc17
      Michael Chapman authored
      The destination libvirt daemon in a migration may segfault if the client
      disconnects immediately after the migration has begun:
      
        # virsh -c qemu+tls://remote/system list --all
         Id    Name                           State
        ----------------------------------------------------
        ...
      
        # timeout --signal KILL 1 \
            virsh migrate example qemu+tls://remote/system \
              --verbose --compressed --live --auto-converge \
              --abort-on-error --unsafe --persistent \
              --undefinesource --copy-storage-all --xml example.xml
        Killed
      
        # virsh -c qemu+tls://remote/system list --all
        error: failed to connect to the hypervisor
        error: unable to connect to server at 'remote:16514': Connection refused
      
      The crash is in:
      
         1531 void
         1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
         1533 {
         1534     qemuDomainObjPrivatePtr priv = obj->privateData;
         1535     qemuDomainJob job = priv->job.active;
         1536
         1537     priv->jobs_queued--;
      
      Backtrace:
      
        #0  at qemuDomainObjEndJob at qemu/qemu_domain.c:1537
        #1  in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497
        #2  in qemuProcessAutoDestroy at qemu/qemu_process.c:5646
        #3  in virCloseCallbacksRun at util/virclosecallbacks.c:350
        #4  in qemuConnectClose at qemu/qemu_driver.c:1154
        ...
      
      qemuDomainRemoveInactive calls virDomainObjListRemove, which in this
      case is holding the last remaining reference to the domain.
      qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain
      object has been freed and poisoned by then.
      
      This patch bumps the domain's refcount until qemuDomainRemoveInactive
      has completed. We also ensure qemuProcessAutoDestroy does not return the
      domain to virCloseCallbacksRun to be unlocked in this case. There is
      similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy
      (which call virDomainObjListRemove directly).
      Signed-off-by: Michael Chapman <mike@very.puzzling.org>
      7578cc17
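      A condensed, hedged sketch of the shape of the fix (simplified; the
      wrapper function below is illustrative, only the calls it makes are the
      ones named in the commit):
      
        /* Hold an extra reference across the removal so that the later
         * qemuDomainObjEndJob() never runs on a freed, poisoned object. */
        static void
        removeInactiveExample(virQEMUDriverPtr driver, virDomainObjPtr vm)
        {
            virObjectRef(vm);                    /* the list may hold the last ref */
            virDomainObjListRemove(driver->domains, vm);
            /* job teardown still sees a live object here */
            virObjectUnref(vm);
        }
      
      In the same spirit, qemuProcessAutoDestroy clears its domain pointer so
      that virCloseCallbacksRun does not try to unlock an object that may
      already be gone.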
  4. 02 Apr, 2015 3 commits
  5. 31 Mar, 2015 1 commit
    • P
      qemu: blockjob: Synchronously update backing chain in XML on ABORT/PIVOT · 630ee5ac
      Peter Krempa authored
      When the synchronous pivot option is selected, libvirt would not update
      the backing chain until the job was exited. Some applications then
      received invalid data as their job serialized first.
      
      This patch removes polling to wait for the ABORT/PIVOT job completion
      and replaces it with a condition. If a synchronous operation is
      requested the update of the XML is executed in the job of the caller of
      the synchronous request. Otherwise the monitor event callback uses a
      separate worker to update the backing chain with a new job.
      
      This is a regression since commit 1a92c719.
      
      When the ABORT job is finished synchronously you get the following call
      stack:
       #0  qemuBlockJobEventProcess
       #1  qemuDomainBlockJobImpl
       #2  qemuDomainBlockJobAbort
       #3  virDomainBlockJobAbort
      
      While previously or while using the _ASYNC flag you'd get:
       #0  qemuBlockJobEventProcess
       #1  processBlockJobEvent
       #2  qemuProcessEventHandler
       #3  virThreadPoolWorker
      630ee5ac
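      The general pattern described above - replacing a polling loop with a
      condition that the event handler signals - looks roughly like this
      generic pthread sketch (the struct and field names are illustrative,
      not libvirt's):
      
        #include <pthread.h>
        #include <stdbool.h>
        
        struct blockjob_sync {
            pthread_mutex_t lock;
            pthread_cond_t  cond;
            bool            finished;      /* set by the event handler */
        };
        
        /* Caller of the synchronous ABORT/PIVOT: sleep on the condition
         * instead of polling, then update the XML in the caller's own job. */
        static void wait_for_completion(struct blockjob_sync *s)
        {
            pthread_mutex_lock(&s->lock);
            while (!s->finished)
                pthread_cond_wait(&s->cond, &s->lock);
            /* update the backing chain here, inside the caller's job */
            pthread_mutex_unlock(&s->lock);
        }
        
        /* Monitor event callback: wake the synchronous waiter; for async
         * callers a separate worker updates the chain in its own job. */
        static void on_block_job_event(struct blockjob_sync *s)
        {
            pthread_mutex_lock(&s->lock);
            s->finished = true;
            pthread_cond_signal(&s->cond);
            pthread_mutex_unlock(&s->lock);
        }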
  6. 26 Mar, 2015 1 commit
  7. 23 Mar, 2015 1 commit
  8. 19 Mar, 2015 1 commit
    • L
      util: clean up #includes of virnetdevopenvswitch.h · 451547a4
      Laine Stump authored
      virnetdevopenvswitch.h declares a few functions that can be called to
      add ports to and remove them from OVS bridges, and retrieve the
      migration data for a port. It does not contain any data definitions
      that are used by domain_conf.h. But for some reason, domain_conf.h
      was #including virnetdevopenvswitch.h anyway; only the files that
      actually call those functions should be directly #including it. This
      adds a few lines to the project, but saves all the files that don't
      need it from the extra compilation, and makes the dependencies more
      clear cut.
      451547a4
  9. 18 Mar, 2015 2 commits
  10. 17 Mar, 2015 1 commit
  11. 16 Mar, 2015 5 commits
    • J
      Convert virDomainVcpuPinFindByVcpu into virDomainPinFindByVcpu · a8a89270
      John Ferlan authored
      Since both the Vcpu and IOThreads code use the same APIs, alter the
      naming of the APIs to remove the "Vcpu"-specific reference.
      a8a89270
    • J
      Convert virDomainVcpuPinDefPtr to virDomainPinDefPtr · 59ba7023
      John Ferlan authored
      As pointed out by jtomko in his review of the IOThreads pinning code:
      
      http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html
      
      there are some comments sprinkled in indicating IOThreads were using
      the same structure as the VcpuPin code...
      
      This is the first patch of a few that will change the virDomainVcpuPin*
      structures and code to just virDomainPin* - starting with the data
      structure naming...
      59ba7023
    • P
      conf: Replace access to def->mem.max_balloon with accessor functions · 4f9907cd
      Peter Krempa authored
      As there are two possible approaches to defining a domain's memory size -
      one used with legacy, non-NUMA VMs, configured in the <memory> element,
      and a per-node based approach on NUMA machines - the user needs to make
      sure that both are specified correctly in the NUMA case.
      
      To avoid this burden on the user I'd like to replace the NUMA case with
      automatic totaling of the memory size. To achieve this I need to replace
      direct access to virDomainMemtune's 'max_balloon' field with
      two separate getters depending on the desired size.
      
      The two sizes are needed as:
      1) Startup memory size doesn't include memory modules in some
      hypervisors.
      2) After startup these count as the usable memory size.
      
      Note that the comments for the functions are future-aware and document
      the state that will be present after a few later patches.
      4f9907cd
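      A hedged sketch of what "automatic totaling" can look like; the struct
      and function names below are hypothetical stand-ins, not libvirt's
      actual accessors:
      
        #include <stddef.h>
        
        /* Hypothetical model: per-node sizes are summed for NUMA guests,
         * the legacy <memory> value is used otherwise. */
        struct example_numa_node { unsigned long long mem_kib; };
        
        struct example_def {
            unsigned long long max_balloon;   /* legacy <memory>, in KiB */
            size_t nnodes;
            struct example_numa_node *nodes;
        };
        
        /* Startup size: what the guest boots with (memory modules excluded). */
        static unsigned long long
        example_get_initial_memory(const struct example_def *def)
        {
            unsigned long long total = 0;
            size_t i;
        
            if (def->nnodes == 0)
                return def->max_balloon;
        
            for (i = 0; i < def->nnodes; i++)
                total += def->nodes[i].mem_kib;
            return total;
        }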
    • P
      qemu: event: Don't fiddle with disk backing trees without a job · 1a92c719
      Peter Krempa authored
      Surprisingly, we did not grab a VM job when a block job finished, and
      we'd happily rewrite the backing chain data. This made it possible to
      crash libvirt when two backing chain updates were queued in quick
      succession, and caused other badness.
      
      To fix it, add yet another handler to the helper thread that handles
      monitor events that require a job.
      1a92c719
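      The split the commit describes, sketched as two hedged fragments using
      the helpers it names (the enum value and exact signatures should be
      treated as assumptions):
      
        /* Monitor event callback (no job held): never touch the backing
         * chain here, just queue the work for the driver's worker pool. */
        processEvent->eventType = QEMU_PROCESS_EVENT_BLOCK_JOB;
        virThreadPoolSendJob(driver->workerPool, 0, processEvent);
        
        /* Worker thread: only rewrite disk backing data once a job is held. */
        if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
            return;
        qemuBlockJobEventProcess(driver, vm, disk, type, status);
        qemuDomainObjEndJob(driver, vm);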
    • P
      5c634730
  12. 03 Mar, 2015 4 commits
  13. 21 Feb, 2015 1 commit
  14. 20 Feb, 2015 1 commit
  15. 19 Feb, 2015 2 commits
    • M
      qemuProcessHandleBlockJob: Take status into account · 76c61cdc
      Michal Privoznik authored
      Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has
      completed (in qemuMonitorJSONHandleBlockJobImpl()). To give a better
      picture, the event looks something like this:
      
      {"timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event":
      "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len":
      8412790784, "offset": 409993216, "speed": 8796093022207, "type":
      "mirror", "error": "No space left on device"}}
      
      If "len" does not equal "offset" it's considered an error, and we can
      clearly see the "error" field filled in. However, later in the event
      processing this case was handled no differently from the case of the
      job being aborted via a separate API. It's time that we start
      differentiating these two because of future work.
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
      76c61cdc
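      A hedged, simplified sketch of the check described above (loosely
      modelled on qemuMonitorJSONHandleBlockJobImpl; error handling trimmed):
      
        /* "len" != "offset" together with a filled-in "error" field means
         * the job failed, which is not the same as a user-requested abort. */
        unsigned long long offset = 0, len = 0;
        const char *error = virJSONValueObjectGetString(data, "error");
        int event = VIR_DOMAIN_BLOCK_JOB_COMPLETED;
        
        if (virJSONValueObjectGetNumberUlong(data, "offset", &offset) == 0 &&
            virJSONValueObjectGetNumberUlong(data, "len", &len) == 0 &&
            (offset != len || error))
            event = VIR_DOMAIN_BLOCK_JOB_FAILED;   /* differentiate from CANCELED */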
    • M
      qemuProcessHandleBlockJob: Set disk->mirrorState more often · c37943a0
      Michal Privoznik authored
      Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated
      each time. The callback code handling the events checks if a blockjob
      was started via our public APIs prior to setting the mirrorState.
      However, some block jobs may be started internally (e.g. during
      storage migration), in which case we don't bother with setting
      disk->mirror (there's nothing we can set it to anyway), or other
      fields. But it will come in handy if we update the mirrorState in these
      cases too. The event wasn't delivered just for fun - we've started the
      job after all.
      
      So, in this commit, the mirrorState is set to whatever job status
      we've obtained. Of course, there are some actions on some statuses
      that we want to perform. But instead of if {} else if {} else {} ...
      enumeration, let's move to switch().
      Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
      c37943a0
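      The switch() the commit moves to, as a hedged sketch; the exact
      status-to-mirrorState mapping below is illustrative, only the enum
      names are libvirt's:
      
        disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_NONE;
        
        switch ((virConnectDomainEventBlockJobStatus) status) {
        case VIR_DOMAIN_BLOCK_JOB_READY:
            disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_READY;
            break;
        case VIR_DOMAIN_BLOCK_JOB_FAILED:
        case VIR_DOMAIN_BLOCK_JOB_CANCELED:
            disk->mirrorState = VIR_DOMAIN_DISK_MIRROR_STATE_ABORT;
            break;
        case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
            /* job finished: mirrorState stays NONE, other cleanup goes here */
            break;
        case VIR_DOMAIN_BLOCK_JOB_LAST:
            break;
        }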
  16. 13 Feb, 2015 1 commit
  17. 12 Feb, 2015 2 commits
    • D
      qemu: fix setting of VM CPU affinity with TCG · a103bb10
      Daniel P. Berrange authored
      In a previous commit I fixed the incorrect handling of vcpu pids
      for TCG mode QEMU:
      
        commit b07f3d82
        Author: Daniel P. Berrange <berrange@redhat.com>
        Date:   Thu Dec 18 16:34:39 2014 +0000
      
          Don't setup fake CPU pids for old QEMU
      
          The code assumes that def->vcpus == nvcpupids, so when we setup
          fake CPU pids for old QEMU with nvcpupids == 1, we cause the
          later code to read off the end of the array. This has fun results
          like sched_setaffinity(0, ...) which changes libvirtd's own CPU
          affinity, or even better sched_setaffinity($RANDOM, ...) which
          changes the affinity of a random OS process.
      
      The intent was that this would merely disable the ability to set
      per-vCPU affinity. It should still have been possible to set VM
      level host CPU affinity.
      
      Unfortunately, when you set  <vcpu cpuset='0-1'>4</vcpu>, the XML
      parser will internally take this & initialize an entry in the
      def->cputune.vcpupin array for every VCPU. IOW this is implicitly
      being treated as
      
        <cputune>
          <vcpupin cpuset='0-1' vcpu='0'/>
          <vcpupin cpuset='0-1' vcpu='1'/>
          <vcpupin cpuset='0-1' vcpu='2'/>
          <vcpupin cpuset='0-1' vcpu='3'/>
        </cputune>
      
      Even more fun, the faked cputune elements are hidden from view when
      querying the live XML, because their cpuset mask is the same as the
      VM default cpumask.
      
      The upshot was that it was impossible to set VM level CPU affinity.
      
      To fix this we must update qemuProcessSetVcpuAffinities so that it
      only reports a fatal error if the per-VCPU cpu mask is different
      from the VM level cpu mask.
      Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
      a103bb10
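      The relaxed check the commit describes, as a hedged sketch of the logic
      in qemuProcessSetVcpuAffinities (field names follow libvirt of that era
      but are best treated as illustrative):
      
        /* No usable per-vCPU threads (TCG): ignore pinning entries that
         * merely mirror the VM-level mask, and only fail on a real conflict. */
        if (priv->nvcpupids == 0) {
            size_t i;
        
            for (i = 0; i < def->cputune.nvcpupin; i++) {
                if (!virBitmapEqual(def->cpumask,
                                    def->cputune.vcpupin[i]->cpumask)) {
                    virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                                   _("cpu affinity is not supported"));
                    return -1;
                }
            }
            return 0;   /* VM-level affinity was already applied elsewhere */
        }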
    • M
  18. 06 Feb, 2015 1 commit
  19. 27 Jan, 2015 2 commits
  20. 19 Jan, 2015 2 commits
  21. 15 Jan, 2015 1 commit
    • J
      Fix vmdef usage while in monitor in qemu process · c749eda4
      Ján Tomko authored
      Make a local copy of the disk alias in qemuProcessInitPasswords,
      instead of referencing the one in the domain definition, which
      might get freed if the domain crashes while we're in the monitor.
      
      Also copy the memballoon period value.
      c749eda4
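      The pattern in a hedged nutshell (simplified fragment; the monitor call
      itself is elided):
      
        /* Copy what we need out of the definition before dropping into the
         * monitor; vm->def may be freed if the domain dies in the meantime. */
        char *alias = NULL;
        
        if (VIR_STRDUP(alias, disk->info.alias) < 0)
            return -1;
        
        qemuDomainObjEnterMonitor(driver, vm);
        /* ... use 'alias' here, never disk->info.alias ... */
        ignore_value(qemuDomainObjExitMonitor(driver, vm));
        
        VIR_FREE(alias);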
  22. 14 Jan, 2015 1 commit
    • P
      qemu_process: detect updated video ram size values from QEMU · ce745914
      Pavel Hrdina authored
      QEMU internally updates the size of video memory if the domain XML
      provided too low a memory size or there are dependencies between a QXL
      device's 'vgamem' and 'ram' sizes. We need to know about the changes
      and store them in the status XML so as not to break migration or
      managedsave across different libvirt versions.
      
      The values are loaded only if the "vgamem_mb" property exists for
      the device.  The presence of "vgamem_mb" also indicates that
      "ram_size" and "vram_size" exist for QXL devices.
      Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
      ce745914
  23. 21 Dec, 2014 1 commit
    • M
      qemu: completely rework reference counting · 540c339a
      Martin Kletzander authored
      There is one problem that causes various errors in the daemon.  When a
      domain is waiting for a job, it is unlocked while waiting on the
      condition.  However, if that domain is for example transient and being
      removed in another API (e.g. cancelling incoming migration), it gets
      unref'd.  If the first call, the one that was waiting, fails to get the
      job, it unrefs the domain object, and because that was the last
      reference, the whole domain object is cleared.  However, when finishing
      the call, the domain must be unlocked, but there is no way for the API
      to know whether it was cleaned up or not (unless there is some ugly
      temporary variable, but let's scratch that).
      
      The root cause is that our APIs don't ref the objects they are using and
      all use the implicit reference that the object has when it is in the
      domain list.  That reference can be removed when the API is waiting for
      a job.  And because each API doesn't do its own ref'ing, it results in
      the ugly checking of the return value of virObjectUnref() that we have
      everywhere.
      
      This patch changes qemuDomObjFromDomain() to ref the domain (using
      virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
      should be the only function in which the return value of
      virObjectUnref() is checked.  This makes all reference counting
      deterministic and makes the code a bit clearer.
      Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
      540c339a
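      The resulting calling convention, sketched as a hedged fragment (both
      helpers are named in the commit; the behaviour noted for
      qemuDomObjEndAPI is an assumption about its shape):
      
        /* Every API now takes its own reference on lookup ... */
        virDomainObjPtr vm = qemuDomObjFromDomain(dom);  /* refs via FindByUUIDRef */
        if (!vm)
            return -1;
        
        /* ... works, possibly waiting on a job while unlocked ... */
        
        /* ... and releases it in exactly one place: */
        qemuDomObjEndAPI(&vm);   /* roughly: unlock, unref, set vm to NULL */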
  24. 19 Dec, 2014 2 commits
    • D
      disable vCPU pinning with TCG mode · 65686e5a
      Daniel P. Berrange authored
      Although QMP returns info about vCPU threads in TCG mode, the
      data it returns is mostly lies. Only the first vCPU has a valid
      thread_id returned. The thread_id given for the other vCPUs is
      in fact the main emulator thread. All vCPUs actually run under
      the same thread in TCG mode.
      
      Our vCPU pinning code is not at all able to cope with this,
      so if you try to set CPU affinity per-vCPU you end up with
      weird errors:
      
      error: Failed to start domain instance-00000007
      error: cannot set CPU affinity on process 24365: Invalid argument
      
      Since few people will care about the performance of TCG with
      strict CPU pinning, let's just disable that for now, so we get
      a clear error message
      
      error: Failed to start domain instance-00000007
      error: Requested operation is not valid: cpu affinity is not supported
      65686e5a
    • D
      Don't setup fake CPU pids for old QEMU · b07f3d82
      Daniel P. Berrange authored
      The code assumes that def->vcpus == nvcpupids, so when we setup
      fake CPU pids for old QEMU with nvcpupids == 1, we cause the
      later code to read off the end of the array. This has fun results
      like sched_setaffinity(0, ...) which changes libvirtd's own CPU
      affinity, or even better sched_setaffinity($RANDOM, ...) which
      changes the affinity of a random OS process.
      b07f3d82
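      Illustration of the overrun being removed, as a hedged fragment rather
      than the actual code:
      
        /* With nvcpupids == 1 but def->vcpus == 4, iterations 1..3 read past
         * the end of priv->vcpupids, so the PID handed to sched_setaffinity()
         * is garbage: 0 re-pins libvirtd itself, anything else re-pins a
         * random process. */
        for (i = 0; i < def->vcpus; i++) {
            if (virProcessSetAffinity(priv->vcpupids[i], cpumask) < 0)
                return -1;
        }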