Commits · 811143c0b6f36ab4724c8b011e365609cac1a8aa · openeuler / libvirt

30 Apr, 2013 2 commits

qemu: put usb cgroup setup in common function · 811143c0

Committed by Laine Stump on Apr 29, 2013

The USB-specific cgroup setup had been inserted inline in
qemuDomainAttachHostUsbDevice and qemuSetupCgroup, but now there is a
common cgroup setup function called for all hostdevs, so it makes sense
to put the usb-specific setup there and just rely on that function
being called.

The one thing I'm uncertain of here (and a reason for not pushing
until after release) is that previously hostdev->missing was checked
only when starting a domain (and cgroup setup for the device skipped
if missing was true), but with this consolidation, it is now checked
in the case of hotplug as well. I don't know if this will have any
practical effect (does it make sense to hotplug a "missing" usb
device?)

811143c0

qemu: add vfio devices to cgroup ACL when appropriate · 6e13860c

Committed by Laine Stump on Apr 29, 2013

PCI device assignment using VFIO requires read/write access by the
qemu process to /dev/vfio/vfio, and /dev/vfio/nn, where "nn" is the
VFIO group number that the assigned device belongs to (and can be
found with the function virPCIDeviceGetVFIOGroupDev).

/dev/vfio/vfio can be accessible to any guest without danger
(according to vfio developers), so it is added to the static ACL.

The group device must be dynamically added to the cgroup ACL for each
vfio hostdev in two places:

1) for any devices in the persistent config when the domain is started
   (done during qemuSetupCgroup())

2) at device attach time for any hotplug devices (done in
   qemuDomainAttachHostDevice)

The group device must be removed from the ACL when a device is
"hot-unplugged" (in qemuDomainDetachHostDevice()).

Note that USB devices are already doing their own cgroup setup and
teardown in the hostdev-usb specific function. I chose to make the new
functions generic and call them in a common location though. We can
then move the USB-specific code (which is duplicated in two locations)
to this single location. I'll be posting a followup patch to do that.

6e13860c

29 Apr, 2013 1 commit

qemu: honor allowDiskFormatProbing when parsing command line · dfb48349

Committed by Ján Tomko on Apr 29, 2013

My commit 024e9af3 broke this.

dfb48349

27 Apr, 2013 9 commits

conf: add missing error on OOM · d0f7fd99

Committed by Ján Tomko on Apr 26, 2013

I removed it in 5c3d5b22 by accident.

d0f7fd99

qemu: prevent invalid reads in qemuAssignDevicePCISlots · 379e4bcc

Committed by Ján Tomko on Apr 26, 2013

Don't reserve slot 2 for video if the machine has no PCI buses.
Error out when the user specifies a video device without
a PCI address when there are no PCI buses.

(This wouldn't work on a machine with no PCI bus anyway since
we do add PCI addresses for video devices to the command line)

379e4bcc

qemu: don't always reserve PCI addresses for implicit controllers · 877bc089

Committed by Ján Tomko on Apr 26, 2013

In the past we automatically added a USB controller and assigned
it a PCI address (0:0:1.2) even on machines without a PCI bus.
This didn't break machines with no PCI bus because the command
line for it is just '-usb', with no mention of the PCI bus.

The implicit IDE controller (reserved address 0:0:1.1) has
no command line at all.

Commit b33eb0dc removed the ability to reserve PCI addresses
on machines without a PCI bus. This made them stop working,
since there would always be the implicit USB controller.

Skip the reservation of addresses for these controllers when
there is no PCI bus, instead of failing.

877bc089

conf: remove extraneous _TYPE from driver backend enums · 19635f7d

Committed by Laine Stump on Apr 26, 2013

This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.

19635f7d

network: support <driver name='vfio'/> in network definitions · d64e114f

Committed by Laine Stump on Apr 26, 2013

I remembered to document this bit, but somehow forgot to implement it.

This adds <driver name='kvm|vfio'/> as a subelement to the <forward>
element of a network (this puts it parallel to the match between
mode='hostdev' attribute in a network and type='hostdev' in an
<interface>).

Since it's already documented, only the parser, formatter, backend
driver recognition (it just translates/moves the flag into the
<interface> at the appropriate time), and a test case were needed.

(I used a separate enum for the values both because the original is
defined in domain_conf.h, which is unavailable from network_conf.h,
and because in the future it's possible that we may want to support
other non-hostdev oriented driver names in the network parser; this
makes sure that one can be expanded without the other).

d64e114f

qemu: launch bridge helper from libvirtd · 2d80fbb1

Committed by Paolo Bonzini on Apr 20, 2013

<source type='bridge'> uses a helper application to do the necessary
TUN/TAP setup to use an existing network bridge, thus letting
unprivileged users use TUN/TAP interfaces.

However, libvirt should be preventing QEMU from running any setuid
programs at all, which would include this helper program.  From
a security POV, any setuid helper needs to be run by libvirtd itself,
not QEMU.

This is what this patch does.  libvirt now invokes the setuid helper,
gets the TAP fd and then passes it to QEMU in the normal manner.
The path to the helper is specified in qemu.conf.

As a small advantage, this adds a <target dev='tap0'/> element to the
XML of an active domain using <interface type='bridge'>.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2d80fbb1

virnetdevtap: add virNetDevTapGetName · 740d98a1

Committed by Paolo Bonzini on Apr 20, 2013

This will be used on a tap file descriptor returned by the bridge helper
to populate the <target> element, because the helper does not provide
the interface name.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

740d98a1

qemu: don't assign a PCI address to 'none' USB controller · a12475bd

Committed by Ján Tomko on Apr 26, 2013

Adjust the usb-none test, since it gives the memballoon a lower PCI slot now.
Add a test for 'none' controller on s390, which doesn't have PCI buses.

a12475bd

fix segfault during virsh save in pv guest · 91d1911c

Committed by Bamvor Jian Zhang on Apr 26, 2013

This patch fixes the wrong ordering of fd and timeout registration. The
ordering was right in dfa1e1dd for fd registration, but it changed in
e0622ca2. In this patch, priv and xl_priv are set in info and the
info->priv ref count is increased before calling virEventAddHandle. If
this is done after virEventAddHandle, the fd callback or the fd
deregistration may see an empty priv or xl_priv, or a wrong ref count.

With this patch applied, more than 100 test rounds passed, compared to a
failure within 3 rounds without it. Each round includes define -> start ->
destroy -> create -> suspend -> resume -> reboot -> shutdown -> save ->
restore -> dump -> destroy -> create -> setmem -> setvcpus -> destroy.

Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>

91d1911c

26 Apr, 2013 26 commits

qemu: set qemu process' RLIMIT_MEMLOCK when VFIO is used · 93958945

Committed by Laine Stump on Apr 25, 2013

VFIO requires all of the guest's memory and IO space to be lockable in
RAM. The domain's max_balloon is the maximum amount of memory the
domain can have (in KiB). We add a generous 1GiB to that for IO space
(still much better than KVM device assignment, where the KVM module
actually *ignores* the process limits and locks everything anyway),
and convert from KiB to bytes.

In the case of hotplug, we are changing the limit for the already
existing qemu process (prlimit() is used under the hood), and for
regular commandline additions of vfio devices, we schedule a call to
setrlimit() that will happen after the qemu process is forked.
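
As a rough illustration of the arithmetic described above, the following C sketch computes the limit in bytes. The function name `sketch_vfio_memlock_bytes` is hypothetical and not taken from libvirt; only the formula (max_balloon in KiB plus 1 GiB, converted to bytes) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libvirt function): compute an RLIMIT_MEMLOCK
 * value in bytes from the domain's max_balloon (in KiB) plus 1 GiB of
 * headroom for IO space. */
uint64_t
sketch_vfio_memlock_bytes(uint64_t max_balloon_kib)
{
    const uint64_t io_headroom_kib = 1024 * 1024;  /* 1 GiB expressed in KiB */

    /* Refuse inputs that would overflow the KiB -> bytes conversion. */
    assert(max_balloon_kib <= UINT64_MAX / 1024 - io_headroom_kib);

    return (max_balloon_kib + io_headroom_kib) * 1024;
}
```

For a guest whose max_balloon is 4 GiB (4194304 KiB), this yields 5 GiB, i.e. 5368709120 bytes.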

93958945

qemu: use new virCommandSetMax(Processes|Files) · 7bdf459d

Committed by Laine Stump on Apr 25, 2013

These were previously being set in a custom hook function, but now
that virCommand directly supports setting them, we can eliminate that
part of the hook and call the APIs directly.

7bdf459d

util: new virCommandSetMax(MemLock|Processes|Files) · 776d49f4

Committed by Laine Stump on Apr 25, 2013

This patch adds two sets of functions:

1) lower level virProcessSet*() functions that will immediately set
the RLIMIT_MEMLOCK, RLIMIT_NPROC, or RLIMIT_NOFILE of either the
current process (using setrlimit()) or any other process (using
prlimit()). "current process" is indicated by passing a 0 for pid.

2) functions for virCommand* that will setup a virCommand object to
set those limits at a later time just after it has forked a new
process, but before it execs the new program.

configure.ac has prlimit and setrlimit added to the list of functions
to check for, and the low level functions log an "unsupported" error
on platforms that don't support those functions.
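
The second set of functions defers the limit until after the fork but before the exec. A minimal sketch of that ordering, using only POSIX calls rather than libvirt's virCommand machinery, might look like the following. The helper name `sketch_run_with_max_files` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch (not libvirt's virCommand): fork, apply a limit in
 * the child, then exec. Returns the child's exit status, or -1 on failure. */
int
sketch_run_with_max_files(char *const argv[], rlim_t max_files)
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the limit is applied after fork() but before exec(). */
        struct rlimit rlim = { .rlim_cur = max_files, .rlim_max = max_files };

        if (setrlimit(RLIMIT_NOFILE, &rlim) < 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }

    /* Parent: its own limits are untouched; just wait for the child. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the limit is set in the child only, the calling process keeps its own limits. Changing the limit of an already running process, as in the first set of functions, would need prlimit() instead.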

776d49f4

Do proper escaping of cgroup resource partitions · f3662737

Committed by Daniel P. Berrange on Apr 26, 2013

If a user cgroup name begins with "cgroup.", "_" or with any of
the controllers from /proc/cgroups followed by a dot, then it
needs to be prefixed with a single underscore. E.g. if there is
an object "cpu.service", then this would end up as "_cpu.service"
in the cgroup filesystem tree; however, "waldo.service" would
stay "waldo.service", at least as long as nobody comes up with
a cgroup controller called "waldo".

Since we require a '.XXXX' suffix on all partitions, there is
no scope for clashing with the kernel 'tasks' and 'release_agent'
files.
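
The escaping rule above can be sketched as a small predicate. This is an illustration only: the function name is hypothetical, and the `controllers` array stands in for the names that would be read from /proc/cgroups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libvirt function): decide whether a cgroup
 * name needs a leading underscore. `controllers` is an assumed stand-in
 * for the controller names listed in /proc/cgroups. */
bool
sketch_cgroup_needs_escape(const char *name,
                           const char *const *controllers,
                           size_t ncontrollers)
{
    assert(name != NULL);

    if (name[0] == '_' || strncmp(name, "cgroup.", 7) == 0)
        return true;

    for (size_t i = 0; i < ncontrollers; i++) {
        size_t len = strlen(controllers[i]);

        /* Match "<controller>." at the very start of the name. */
        if (strncmp(name, controllers[i], len) == 0 && name[len] == '.')
            return true;
    }
    return false;
}
```

With a controller list containing "cpu", the name "cpu.service" is flagged for escaping while "waldo.service" is left alone.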

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

f3662737

Ensure all cgroup partitions have a suffix of ".partition" · 9ddfe7ee

Committed by Daniel P. Berrange on Apr 26, 2013

If the partition name passed in the XML does not already have
a suffix, ensure it gets a '.partition' added to each component.
The exceptions are /machine, /user and /system which do not need
to have a suffix, since they are fixed partitions at the top
level.
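
One way to picture the suffix rule is the sketch below. It is not libvirt's implementation. The function names are hypothetical, and it assumes that "already has a suffix" means the component contains a dot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if the component is one of the fixed top-level partitions. */
static bool
is_fixed_top_level(const char *comp, size_t len)
{
    static const char *const fixed[] = { "machine", "user", "system" };

    for (size_t i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++) {
        if (strlen(fixed[i]) == len && strncmp(comp, fixed[i], len) == 0)
            return true;
    }
    return false;
}

/* Hypothetical helper: copy `path` into `out`, appending ".partition" to
 * every component that lacks a suffix. Returns 0 on success, -1 if `out`
 * is too small. */
int
sketch_add_partition_suffix(const char *path, char *out, size_t outlen)
{
    size_t used = 0;
    bool first = true;

    assert(path != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    for (;;) {
        const char *suffix = ".partition";
        size_t len;
        int n;

        path += strspn(path, "/");      /* skip separators */
        if (*path == '\0')
            break;
        len = strcspn(path, "/");       /* length of this component */

        /* Leave a component alone if it already carries a suffix, or if
         * it is a fixed partition at the top level. */
        if (memchr(path, '.', len) != NULL ||
            (first && is_fixed_top_level(path, len)))
            suffix = "";

        n = snprintf(out + used, outlen - used, "/%.*s%s",
                     (int)len, path, suffix);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;                  /* output buffer too small */

        used += (size_t)n;
        path += len;
        first = false;
    }
    return 0;
}
```

Under these assumptions "/machine/production" becomes "/machine/production.partition", while "/custom/web" becomes "/custom.partition/web.partition".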

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

9ddfe7ee

Change VM cgroup suffix from '{lxc,qemu}.libvirt' to 'libvirt-{lxc,qemu}' · 824e86e7

Committed by Daniel P. Berrange on Apr 26, 2013

Recently we changed to create VM cgroups with the naming pattern
$VMNAME.$DRIVER.libvirt. Following discussions with the systemd
community it was decided that only having a single '.' in the
names is preferable. So this changes the naming scheme to be
$VMNAME.libvirt-$DRIVER, e.g. for LXC 'mycontainer.libvirt-lxc' or
for KVM 'myvm.libvirt-qemu'.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

824e86e7

test: Add JSON test for query-tpm-types · 7e77f252

Committed by Stefan Berger on Apr 26, 2013

Add a test case for query-tpm-models QMP command.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>

7e77f252

virsh: suppress aliases in group help · 117dc4cc

Committed by Eric Blake on Apr 26, 2013

'virsh help | grep nodedev-det' shows only nodedev-detach, but
'virsh help nodedev | grep nodedev-det' also shows the old alias
nodedev-dettach that we intentionally hid in commit af3f9aab.

See also commit 787f4feb and this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=956966

* tools/virsh.c (vshCmdGrpHelp): Copy suppression of vshCmdHelp.

Signed-off-by: Eric Blake <eblake@redhat.com>

117dc4cc

security: update hostdev labelling functions for VFIO · f0bd70a9

Committed by Laine Stump on Apr 25, 2013

Legacy kvm style pci device assignment requires changes to the
labelling of several sysfs files for each device, but for vfio device
assignment, the only thing that needs to be relabelled/chowned is the
"group" device for the group that contains the device to be assigned.

f0bd70a9

util: new function virPCIDeviceGetVFIOGroupDev · b210208f

Committed by Laine Stump on Apr 25, 2013

Given a virPCIDevice, this function returns the path for the device
that controls the vfio group the device belongs to,
e.g. "/dev/vfio/15".
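
The commit message does not say how the group is looked up. On Linux, a PCI device's sysfs directory normally has an `iommu_group` symlink whose last path component is the group number, so a lookup could plausibly be built on that. The sketch below only covers the string handling, under that assumption. The function name is hypothetical, and the caller is assumed to have already read the link target (for example with readlink()).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libvirt's virPCIDeviceGetVFIOGroupDev): build
 * "/dev/vfio/<group>" from the target of an iommu_group symlink.
 * Returns 0 on success, -1 on a malformed target or a short buffer. */
int
sketch_vfio_group_dev(const char *link_target, char *out, size_t outlen)
{
    const char *group;
    int n;

    assert(link_target != NULL && out != NULL);

    /* The group number is the last path component of the link target. */
    group = strrchr(link_target, '/');
    group = group ? group + 1 : link_target;
    if (*group == '\0')
        return -1;

    n = snprintf(out, outlen, "/dev/vfio/%s", group);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Given a link target ending in "iommu_groups/15", the sketch produces "/dev/vfio/15", matching the example in the commit message.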

b210208f

virsh: use new virNodeDeviceDetachFlags · d923f6c8

Committed by Laine Stump on Apr 24, 2013

The virsh nodedev-detach command has a new --driver option. If it's
given, virsh will attempt to use the new virNodeDeviceDetachFlags API
instead of virNodeDeviceDettach. Validation of the driver name string
is left to the hypervisor (qemu accepts "kvm" or "vfio". The only
other hypervisor that implements these functions is xen, and it only
accepts NULL).

d923f6c8

xen: implement virNodeDeviceDetachFlags backend · cad14a52

Committed by Laine Stump on Apr 24, 2013

This was the only hypervisor driver other than qemu that implemented
virNodeDeviceDettach. It doesn't currently support multiple pci device
assignment driver backends, but it is simple to plug in this new API,
which will make it easier for Xen people to fill it in later when they
decide to support VFIO (or whatever other) device assignment. Also it
means that management applications will have the same API available to
them for both hypervisors on any given version of libvirt.

The only acceptable value for driverName in this case is NULL, since
there is no alternate, and I'm not willing to pick a name for the
default driver used by Xen.

cad14a52

qemu: implement virNodeDeviceDetachFlags backend · eaff1611

Committed by Laine Stump on Apr 24, 2013

The differences from virNodeDeviceDettach are very minor:

1) Check that the flags are 0.

2) Set the virPCIDevice's stubDriver according to the driverName that
   is passed in.

3) Call virPCIDeviceDetach with a NULL stubDriver, indicating it
   should get the name of the stub driver from the virPCIDevice
   object.
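
The first two steps can be sketched as a small, self-contained function. This is not the qemu driver code: the function name is hypothetical, and the mapping from driver name to stub driver ("vfio" to "vfio-pci", otherwise "pci-stub", with NULL treated as the "kvm" default) is pieced together from the neighbouring commit messages.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: validate the arguments and pick a stub driver name.
 * Returns NULL if the flags or the driver name are rejected. */
const char *
sketch_stub_driver_for(const char *driver_name, unsigned int flags)
{
    if (flags != 0)
        return NULL;                    /* step 1: flags must be 0 */

    /* Step 2: choose the stub driver from the driver name. */
    if (driver_name == NULL || strcmp(driver_name, "kvm") == 0)
        return "pci-stub";              /* default / legacy KVM assignment */
    if (strcmp(driver_name, "vfio") == 0)
        return "vfio-pci";

    return NULL;                        /* unknown backend name */
}
```

In the real driver the chosen name would be stored in the virPCIDevice object, so that the detach call can be made with a NULL stub driver argument as described in step 3.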

eaff1611

hypervisor api: implement RPC calls for virNodeDeviceDetachFlags · cc875b83

Committed by Laine Stump on Apr 24, 2013

This requires a custom function for remoteNodeDeviceDetachFlags,
because it is named *NodeDevice, but it goes through the hypervisor
driver rather than nodedevice driver, and so it uses privateData
instead of nodeDevicePrivateData. (It has to go through the hypervisor
driver, because that is the driver that knows about the backend drivers
that will perform the pci device assignment).

cc875b83

hypervisor api: new virNodeDeviceDetachFlags · 35394196

Committed by Laine Stump on Apr 24, 2013

The existing virNodeDeviceDettach() assumes that there is only a
single PCI device assignment backend driver appropriate for any
hypervisor. This is no longer true, as the qemu driver is getting
support for PCI device assignment via VFIO. The new API
virNodeDeviceDetachFlags adds a driverName arg that should be set to
the exact same string set in a domain <hostdev>'s <driver name='x'/>
element (i.e. "vfio", "kvm", or NULL for default). It also adds a
flags arg for good measure (and because it's possible we may need it
when we start dealing with VFIO's "device groups").

35394196

qemu: bind/unbind stub driver according to config <driver name='x'/> · cc0a9188

Committed by Laine Stump on Apr 23, 2013

If the config for a device has specified <driver name='vfio'/>,
"backend" in the pci part of the hostdev object will be set to
..._VFIO. In this case, when creating a virPCIDevice set the
stubDriver to "vfio-pci", otherwise set it to "pci-stub". We will rely
on the lower levels to report an error if the vfio driver isn't
loaded.

The detach/attach functions in virpci.c will pay attention to the
stubDriver setting in the device, and bind/unbind the appropriate
driver when preparing hostdevs for the domain.

Note that we don't yet attempt to do anything to mark active any other
devices in the same vfio "group" as a single device that is being
marked active. We do need to do that, but in order to get basic VFIO
functionality testing sooner rather than later, initially we'll just
live with more cryptic errors when someone tries to do that.

cc0a9188

pci: keep a stubDriver in each virPCIDevice · be64199e

Committed by Laine Stump on Apr 23, 2013

This can be set when the virPCIDevice is created and placed on a list,
then used later when traversing the list to determine which stub
driver to bind/unbind for managed devices.

The existing Detach and Attach functions' signatures haven't been
changed (they still accept a stub driver name in the arg list), but if
the arg list has NULL for stub driver and one is available in the
device's object, that will be used. (we may later deprecate and remove
the arg from those functions).
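
The fallback described above amounts to a two-line decision. The sketch below uses a hypothetical struct and function name in place of libvirt's virPCIDevice, purely to show the precedence.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of a virPCIDevice. */
struct sketch_pci_device {
    const char *stub_driver;   /* may be NULL if never set */
};

/* Pick the stub driver to bind/unbind: an explicit argument is used as
 * given; a NULL argument falls back to the name stored in the device. */
const char *
sketch_effective_stub_driver(const struct sketch_pci_device *dev,
                             const char *stub_driver_arg)
{
    if (stub_driver_arg != NULL)
        return stub_driver_arg;
    return dev->stub_driver;
}
```

If both the argument and the stored name are NULL, the sketch returns NULL and leaves error handling to the caller.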

be64199e

qemu: use vfio-pci on commandline when appropriate · 731b0f36

Committed by Laine Stump on Apr 25, 2013

The device option for vfio-pci is nearly identical to that for
pci-assign - only the configfd parameter isn't supported (or needed).

Checking for presence of the bootindex parameter is done separately
from constructing the commandline, similar to how it is done for
pci-assign.

This patch contains tests to check for proper commandline
construction. It also includes tests for parser-formatter-parser
roundtrips (xml2xml), because those tests use the same data files, and
would have failed had they been included before now.

qemu: xml/args tests for VFIO hostdev and <interface type='hostdev'/>

These should be squashed in with the patch that adds commandline
handling of vfio (they would fail at any earlier time).

731b0f36

conf: formatter/parser/RNG/docs for hostdev <driver name='kvm|vfio'/> · c4f63ef0

Committed by Laine Stump on Mar 15, 2013

A domain's <interface> or <hostdev>, as well as a <network>'s
<forward>, can now have an optional <driver name='kvm|vfio'/>
element. As of this patch, there is no functionality behind this new
knob - this patch adds support to the domain and network
formatter/parser, and to the RNG and documentation.

When the backend is added, legacy KVM PCI device assignment will
continue to be used when no driver name is specified (or if <driver
name='kvm'/> is specified), but if driver name is 'vfio', the new UEFI
Secure Boot compatible VFIO device assignment will be used.

Note that the parser doesn't automatically insert the current default
value of this setting. This is done on purpose because the two
possibilities are functionally equivalent from the guest's point of
view, and we want to be able to automatically start using vfio as the
default (even for existing domains) at some time in the future. This
is similar to what was done with the "vhost" driver option in
<interface>.

c4f63ef0

conf: put hostdev pci address in a struct · 9f80fc1b

Committed by Laine Stump on Mar 18, 2013

There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci"). This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".

9f80fc1b

qemu: detect vfio-pci device and its bootindex parameter · 5b90ef08

Committed by Laine Stump on Apr 17, 2013

QEMU_CAPS_DEVICE_VFIO_PCI is set if the device named "vfio-pci" is
supported in the qemu binary.

QEMU_CAPS_VFIO_PCI_BOOTINDEX is set if the vfio-pci device supports
the "bootindex" parameter;  for some reason, the bootindex parameter
wasn't included in early versions of vfio support (qemu 1.4) so we
have to check for it separately from vfio itself.

5b90ef08

build: avoid unsafe functions in libgen.h · 1fbf1905

Committed by Eric Blake on Apr 25, 2013

POSIX says that both basename() and dirname() may return static
storage (aka they need not be thread-safe); and that they may, but
need not, modify their input argument.  Furthermore, <libgen.h>
is not available on all platforms.  For these reasons, you should
never use these functions in a multi-threaded library.

Gnulib instead recommends a way to avoid the portability nightmare:
gnulib's "dirname.h" provides useful thread-safe counterparts.  The
obvious dir_name() and base_name() are GPL (because they malloc(),
but call exit() on failure) so we can't use them; but the LGPL
variants mdir_name() (malloc's or returns NULL) and last_component
(always points into the incoming string without modifying it,
differing from basename semantics only on corner cases like the
empty string that we shouldn't be hitting in the first place) are
already in use in libvirt.  This finishes the swap over to the safe
functions.
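
To make the "points into the incoming string without modifying it" idea concrete, here is a simplified sketch. It is not gnulib's last_component(): the function name is hypothetical, and corner cases such as trailing slashes are deliberately ignored.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the idea behind last_component():
 * return a pointer into `path` just past the final '/', without copying
 * or modifying the input and without using static storage. */
const char *
sketch_last_component(const char *path)
{
    const char *slash;

    assert(path != NULL);

    slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```

Because the result aliases the caller's string, it is safe to call from several threads at once, and it stays valid exactly as long as the input does.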

* cfg.mk (sc_prohibit_libgen): New rule.
* src/util/vircgroup.c: Fix offenders.
* src/parallels/parallels_storage.c (parallelsPoolAddByDomain):
Likewise.
* src/parallels/parallels_network.c (parallelsGetBridgedNetInfo):
Likewise.
* src/node_device/node_device_udev.c (udevProcessSCSIHost)
(udevProcessSCSIDevice): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskDeleteVol): Likewise.
* src/util/virpci.c (virPCIGetDeviceAddressFromSysfsLink):
Likewise.
* src/util/virstoragefile.h (_virStorageFileMetadata): Avoid false
positive.

Signed-off-by: Eric Blake <eblake@redhat.com>

1fbf1905

Fix VIR_DOMAIN_EVENT_ID_PMSUSPEND capitalization in API doc · 09c9395a

Committed by Christophe Fergeau on Apr 25, 2013

It was written VIR_DOMAIN_EVENT_ID_PMSuspend

09c9395a

Improve /domainsnapshot/disks/disk@snapshot doc · cc6d19f3

Committed by Christophe Fergeau on Apr 25, 2013

The previous description was a bit confusing.

cc6d19f3

qemu: fix build error with older platforms · b121584f

Committed by Eric Blake on Apr 25, 2013

Jim Fehlig reported on IRC that older gcc/glibc triggers this warning:

cc1: warnings being treated as errors
qemu/qemu_domain.c: In function 'qemuDomainDefFormatBuf':
qemu/qemu_domain.c:1297: error: declaration of 'remove' shadows a global declaration [-Wshadow]
/usr/include/stdio.h:157: error: shadowed declaration is here [-Wshadow]
make[3]: *** [libvirt_driver_qemu_impl_la-qemu_domain.lo] Error 1

Fix it like we have done in the past (such as commit 2e6322a7).

* src/qemu/qemu_domain.c (qemuDomainDefFormatBuf): Avoid shadowing
a function name.

Signed-off-by: Eric Blake <eblake@redhat.com>

b121584f

docs: fix memballoon examples · caf659a8

Committed by Ján Tomko on Apr 25, 2013

Use a pair of 'memballoon' tags instead of single 'watchdog' one.
Add a few missing colons.

caf659a8

25 Apr, 2013 2 commits

conf: reject controllers with duplicate indexes · 2bbbf0be

Committed by Ján Tomko on Apr 23, 2013

Reject multiple controllers with the same index,
except for USB controllers.
Multi-function USB controllers can have the same index.

2bbbf0be

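
A check along the lines of the commit above could be sketched as follows. The types and names are hypothetical, and the sketch assumes that indexes are compared only between controllers of the same type, which the commit message does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a controller definition. */
enum sketch_controller_type {
    SKETCH_CONTROLLER_IDE,
    SKETCH_CONTROLLER_SCSI,
    SKETCH_CONTROLLER_USB
};

struct sketch_controller {
    enum sketch_controller_type type;
    unsigned int idx;
};

/* Returns true if two non-USB controllers of the same type share an
 * index (assumption: only same-type controllers are compared). */
bool
sketch_has_duplicate_index(const struct sketch_controller *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (c[i].type == SKETCH_CONTROLLER_USB)
            continue;   /* multi-function USB controllers may share an index */

        for (size_t j = i + 1; j < n; j++) {
            if (c[j].type == c[i].type && c[j].idx == c[i].idx)
                return true;
        }
    }
    return false;
}
```

A parser using such a check would report an error and refuse the configuration when it returns true.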

qemu: auto-add pci-root to 'pc-i440*' machines too · 5c9cffea

Committed by Ján Tomko on Apr 25, 2013

Commit b33eb0dc missed this machine type.

5c9cffea