Commits · aed4986322fe77bdf718e31a0587d00f04f3d97a · openeuler / libvirt

22 Apr, 2013 1 commit

Change default resource partition to /machine · aed49863

Committed by Daniel P. Berrange on Apr 18, 2013

After discussions with systemd developers, it was decided that
a better default policy for resource partitions is to have
3 default partitions at the top level:

   /system  - system services
   /machine - virtual machines / containers
   /user    - user login sessions

This ensures that the default policy isolates guests from
user login sessions & system services, so a misbehaving
guest can't consume 100% of the CPU if other things are
contending for it.

Thus we change the default partition from /system to
/machine.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

aed49863

16 Apr, 2013 5 commits

Remove non-functional code for setting up non-root cgroups · 767596bd

Committed by Daniel P. Berrange on Apr 04, 2013

The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

767596bd

Change default cgroup layout for QEMU/LXC and honour XML config · db44eb1b

Committed by Daniel P. Berrange on Apr 03, 2013

Historically QEMU/LXC guests have been placed in a cgroup layout
that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons:

 - The cgroup hierarchy gets very deep, which seriously
   impacts kernel performance due to cgroups scalability
   limitations.

 - It is hard to set up cgroup policies which apply across
   services and virtual machines, since all VMs are underneath
   the libvirtd service.

To address this, the default cgroup location is changed to
be

    /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be set up
across all of them.

This also honours the new resource partition location from the
XML configuration, for example

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

will result in the VM being placed at

    /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default /system path, which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way to do this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

db44eb1b
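
As an illustration of the layout described in db44eb1b, the following Python sketch builds the cgroup directory for a guest from its name, driver and resource partition. The function name `guest_cgroup_path` and its parameters are hypothetical and are not libvirt APIs; the sketch only mirrors the path pattern quoted in the commit message, including its /system default (which aed49863 later changes to /machine).

```python
def guest_cgroup_path(vm_name: str, driver: str, partition: str = "/system") -> str:
    """Return the cgroup directory for a guest under a resource partition.

    Hypothetical helper: it mirrors the pattern
    $PARTITION/$VMNAME.{lxc,qemu}.libvirt from the commit message.
    """
    if driver not in ("qemu", "lxc"):
        raise ValueError(f"unsupported driver: {driver!r}")
    if not partition.startswith("/"):
        raise ValueError("a resource partition must be an absolute path")
    # Strip a trailing slash so "/system/" and "/system" give the same result.
    base = partition.rstrip("/")
    return f"{base}/{vm_name}.{driver}.libvirt"


print(guest_cgroup_path("demo", "qemu"))
print(guest_cgroup_path("demo", "lxc", "/virtualmachines/production"))
```

The sketch does not create any directories, which matches the note that only the default partition is expected to exist and that other partitions must be configured by the admin or application.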

Add a new virCgroupNewPartition for setting up resource partitions · aa8604dd

Committed by Daniel P. Berrange on Mar 28, 2013

A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

aa8604dd

Rename virCgroupForXXX to virCgroupNewXXX · 04c18d25

Committed by Daniel P. Berrange on Mar 28, 2013

Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

04c18d25

Store a virCgroupPtr instance in qemuDomainObjPrivatePtr · 632f78ca

Committed by Daniel P. Berrange on Mar 21, 2013

Instead of calling virCgroupForDomain every time we need
the virCgroupPtr instance, just do it once at VM startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

632f78ca

13 Apr, 2013 1 commit

QEMU Cgroup support for TPM passthrough · 22feb0d3

Committed by Stefan Berger on Apr 12, 2013

Some refactoring for virDomainChrSourceDef type of devices so
we can use common code.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>

22feb0d3

08 Apr, 2013 1 commit

Rename virCgroupMounted to virCgroupHasController & make it more robust · dca927c8

Committed by Daniel P. Berrange on Mar 21, 2013

The virCgroupMounted method is badly named, since a controller can be
mounted, but disabled in the current object. Rename the method to be
virCgroupHasController. Also make it tolerant of a NULL virCgroupPtr
and an out-of-range controller index, to avoid duplication of these
checks in all callers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

dca927c8
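
The defensive behaviour described in dca927c8 can be sketched in Python as follows. This is not the libvirt implementation: `has_controller` is a hypothetical stand-in, and a group is modelled here simply as a list of per-controller "enabled in this object" flags, with `None` playing the role of a NULL virCgroupPtr.

```python
from typing import Optional, Sequence


def has_controller(group: Optional[Sequence[bool]], index: int) -> bool:
    """Report whether a controller is usable in this group.

    Tolerates a missing group and an out-of-range index so that
    callers do not have to repeat these checks themselves.
    """
    if group is None:
        return False
    if index < 0 or index >= len(group):
        return False
    # A controller can be mounted on the host yet disabled in this group.
    return bool(group[index])


flags = [True, False, True]  # hypothetical per-controller flags
print(has_controller(flags, 0), has_controller(flags, 1))
print(has_controller(None, 0), has_controller(flags, 99))
```

Folding the checks into the query keeps every call site down to a single boolean test.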

05 Apr, 2013 1 commit

Don't create dirs in cgroup controllers we don't want to use · 56f27b3b

Committed by Daniel P. Berrange on Mar 21, 2013

Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.

Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

56f27b3b
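
A rough Python sketch of the bitmask idea in 56f27b3b follows. The `Controller` enum, its ordering, the helper names and the `/sys/fs/cgroup/<controller>` mount layout are all assumptions made for illustration; the sketch only computes which directories would be created and does not touch the filesystem.

```python
from enum import IntEnum
from typing import List, Optional


class Controller(IntEnum):
    # Illustrative ordering only; not taken from the libvirt sources.
    CPU = 0
    CPUACCT = 1
    CPUSET = 2
    MEMORY = 3
    DEVICES = 4
    FREEZER = 5
    BLKIO = 6


def mask_of(*controllers: Controller) -> int:
    """Build a bitmask with one bit set per requested controller."""
    mask = 0
    for controller in controllers:
        mask |= 1 << controller
    return mask


def dirs_to_create(path: str, requested: int,
                   parent_active: Optional[int] = None) -> List[str]:
    """List the directories that would be created for a group."""
    if parent_active is not None:
        # A child group only uses controllers that are active in its parent.
        requested &= parent_active
    return [
        f"/sys/fs/cgroup/{c.name.lower()}{path}"
        for c in Controller
        if requested & (1 << c)
    ]


driver_mask = mask_of(Controller.CPU, Controller.MEMORY)
print(dirs_to_create("/libvirt/qemu", driver_mask))
print(dirs_to_create("/libvirt/qemu/vm1",
                     mask_of(Controller.CPU, Controller.DEVICES), driver_mask))
```

In the second call the devices controller is dropped because it is not active in the parent, so nothing is created under it.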

20 Mar, 2013 1 commit

NUMA: cleanup for numa related codes · 45e9d27a

Committed by Gao feng on Mar 20, 2013

To reduce redundant code, use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.

This patch also moves the NUMA-related code to the
files virnuma.c and virnuma.h.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>

45e9d27a

28 Feb, 2013 2 commits

Don't try to add non-existant devices to ACL · 7f544a4c

Committed by Daniel P. Berrange on Feb 27, 2013

The QEMU driver has a list of device nodes that are whitelisted
for all guests. The kernel has recently started returning an
error if you try to whitelist a device which does not exist.
This causes a warning in libvirt logs and an audit error for
any missing devices, e.g.

2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

7f544a4c

Avoid spamming logs with cgroups warnings · 279336c5

Committed by Daniel P. Berrange on Feb 27, 2013

The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings:

2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6

This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

279336c5
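
The fix described in 279336c5 can be illustrated with a small Python model. Everything here is hypothetical: a group is a dict mapping controller names to sets of task IDs, and `move_tasks` and `move_to_child` are invented names. The point is only that the caller iterates over the controllers the child group actually has, while the low-level move raises a hard error instead of logging a warning.

```python
from typing import Dict, Set

Group = Dict[str, Set[int]]  # controller name -> task IDs (toy model)


def move_tasks(src: Group, dst: Group, controller: str) -> None:
    """Move all tasks for one controller; fail hard if it is missing."""
    if controller not in src or controller not in dst:
        raise KeyError(f"no cgroup in controller {controller}")
    dst[controller] |= src[controller]
    src[controller].clear()


def move_to_child(parent: Group, child: Group) -> None:
    """Move tasks only in the controllers that were created for the child."""
    for controller in child:
        move_tasks(parent, child, controller)


vm_group = {"cpu": {101, 102}, "memory": {101, 102}}
emulator_group = {"cpu": set()}  # child created for one controller only
move_to_child(vm_group, emulator_group)
print(emulator_group, vm_group)
```

Because the loop is driven by the child's controllers, the memory controller is never touched and no error is raised for it.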

22 Feb, 2013 1 commit

qemu: check backing chains even when cgroup is omitted · 82d5fe54

Committed by Eric Blake on Feb 20, 2013

https://bugzilla.redhat.com/show_bug.cgi?id=896685 points out
a regression caused by commit 38c4a9cc - libvirt only labels
the backing chain if the backing chain cache is populated, but
the code to populate the cache was only conditionally performed
if cgroup labeling was necessary.

* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Hoist cache setup...
* src/qemu/qemu_process.c (qemuProcessStart): ...earlier into
caller, where it is now unconditional.

82d5fe54

06 Feb, 2013 2 commits

Rename all USB device functions to have a standard name prefix · 77c3015f

Committed by Daniel P. Berrange on Jan 14, 2013

Rename all the usbDeviceXXX and usbXXXDevice APIs to have a
fixed virUSBDevice name prefix

77c3015f

Fix leak of usbDevice struct when initializing cgroups · 3e86e8f3

Committed by Daniel P. Berrange on Feb 05, 2013

When iterating over USB host devices to set up cgroups, the
usbDevice object was leaked in both the LXC and QEMU drivers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

3e86e8f3

05 Feb, 2013 1 commit

Introduce a virQEMUDriverConfigPtr object · b090aa7d

Committed by Daniel P. Berrange on Jan 10, 2013

Currently the virQEMUDriverPtr struct contains a wide variety
of data with varying access needs. Move all the static config
data into a dedicated virQEMUDriverConfigPtr object. The only
locking requirement is to hold the driver lock, while obtaining
an instance of virQEMUDriverConfigPtr. Once a reference is held
on the config object, it can be used completely lockless since
it is immutable.

NB, not all APIs correctly hold the driver lock while getting
a reference to the config object in this patch. This is safe
for now since the config is never updated on the fly. Later
patches will address this fully.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

b090aa7d
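
The access pattern described in b090aa7d (take the driver lock only to obtain the config object, then read it without locking because it is immutable) can be approximated in Python as below. This is a loose analogy rather than the libvirt code: `DriverConfig`, `Driver` and the field names are hypothetical, and Python's object references stand in for the explicit reference counting a C object would need.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverConfig:
    """Immutable snapshot of static settings (field names are hypothetical)."""
    privileged: bool
    cgroup_controllers: tuple


class Driver:
    def __init__(self, config: DriverConfig) -> None:
        self._lock = threading.Lock()
        self._config = config

    def get_config(self) -> DriverConfig:
        # The lock is held only long enough to read the current reference.
        with self._lock:
            return self._config

    def replace_config(self, config: DriverConfig) -> None:
        # Updates swap in a whole new object instead of mutating the old one.
        with self._lock:
            self._config = config


driver = Driver(DriverConfig(privileged=True, cgroup_controllers=("cpu", "memory")))
cfg = driver.get_config()
driver.replace_config(DriverConfig(privileged=False, cgroup_controllers=()))
print(cfg.privileged)  # the earlier snapshot is unaffected by the swap
```

A reader that already holds `cfg` keeps a consistent view even if the driver's config is replaced afterwards, which is what makes lock-free reads safe in this model.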

10 Jan, 2013 1 commit

maint: fix comment typo · 70345318

Committed by Eric Blake on Jan 09, 2013

While OOM can have knock-on effects that trash a system, generally
the first symptom is one of memory thrashing.

* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Reword slightly.

70345318

08 Jan, 2013 1 commit

qemu: Relax hard RSS limit · 3c83df67

Committed by Michal Privoznik on Jan 08, 2013

Currently, if there's no hard memory limit defined for a domain,
libvirt tries to calculate one, based on the domain definition and a
magic equation, and sets it upon domain startup. The rationale was
that if there's a memory leak or exploit in qemu, we should prevent
the host system from thrashing. However, the equation was too tight, as
it didn't reflect what the kernel counts into the memory used by a
process. Since many hosts do have swap, nobody noticed anything,
because if the hard memory limit is reached, the process can
continue allocating memory on swap. However, if there is no swap on
the host, the process gets killed by the OOM killer. In our case, that
is the qemu process.

To prevent this, we need to relax the hard RSS limit. Moreover, we
should reflect more precisely the kernel's way of accounting memory
for a process. That is, even the kernel caches are counted within the
memory used by a process (within cgroups at least). Hence the magic
equation has to be changed:

  limit = 1.5 * (domain memory + total video memory) + (32MB for cache
          per each disk) + 200MB

3c83df67
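
The equation quoted in 3c83df67 can be written out as a short Python function. The function name, the parameter names and the choice of MiB as the unit are assumptions made for this sketch; only the coefficients come from the commit message.

```python
def default_hard_limit_mib(domain_memory_mib: float,
                           video_memory_mib: float,
                           disk_count: int) -> float:
    """Default hard memory limit in MiB, following the quoted equation.

    limit = 1.5 * (domain memory + total video memory)
            + 32 MB of cache per disk + 200 MB
    """
    return 1.5 * (domain_memory_mib + video_memory_mib) + 32 * disk_count + 200


# Example: a 1024 MiB guest with 16 MiB of video memory and two disks.
print(default_hard_limit_mib(1024, 16, 2))  # 1824.0
```

For that example the terms are 1.5 * 1040 = 1560, plus 64 for the two disks, plus 200, giving 1824 MiB.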

21 Dec, 2012 5 commits

Rename virterror.c virterror_internal.h to virerror.{c,h} · f24404a3

Committed by Daniel P. Berrange on Dec 13, 2012

f24404a3

Rename util.{c,h} to virutil.{c,h} · 44f6ae27

Committed by Daniel P. Berrange on Dec 13, 2012

44f6ae27

Rename memory.{c,h} to viralloc.{c,h} · ab9b7ec2

Committed by Daniel P. Berrange on Dec 12, 2012

ab9b7ec2

Rename logging.{c,h} to virlog.{c,h} · 936d95d3

Committed by Daniel P. Berrange on Dec 12, 2012

936d95d3

Rename cgroup.{h,c} to vircgroup.{h,c} · f9c7020c

Committed by Daniel P. Berrange on Dec 03, 2012

To bring in line with new naming practice, rename the
src/util/cgroup.{h,c} files to vircgroup.{h,c}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

f9c7020c

18 Dec, 2012 1 commit

Allow passing a vroot into security manager hostdev labelling · df5928ea

Committed by Daniel P. Berrange on Nov 27, 2012

When LXC labels USB devices during hotplug, it is running in
host context, so it needs to pass in a vroot path to the
container root.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

df5928ea

29 Nov, 2012 1 commit

Replace 'struct qemud_driver *' with virQEMUDriverPtr · 4738c2a7

Committed by Daniel P. Berrange on Nov 28, 2012

Remove the obsolete 'qemud' naming prefix and underscore
based type name. Introduce virQEMUDriverPtr as the replacement,
in common with LXC driver naming style

4738c2a7

02 Nov, 2012 1 commit

Remove spurious whitespace between function name & open brackets · 1c04f999

Committed by Daniel P. Berrange on Oct 17, 2012

The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

1c04f999

24 Oct, 2012 1 commit

qemu: Keep the affinity when creating cgroup for emulator thread · bb81021b

Committed by Osier Yang on Oct 24, 2012

When the cpu placement model is "auto", it sets the affinity for
domain process with the advisory nodeset from numad, however,
creating cgroup for the domain process (called emulator thread
in some contexts) later overrides that with pinning it to all
available pCPUs.

How to reproduce:

  * Configure the domain with "auto" placement for <vcpu>, e.g.
    <vcpu placement='auto'>4</vcpu>
  * % virsh start dom
  * % cat /proc/$dompid/status

Although the emulator cgroup causes conflicts, we can't simply
prohibit creating it, as other tunables are still useful, such
as "emulator_period", which is used by the API
virDomainSetSchedulerParameter. So this patch doesn't prohibit
creating the emulator cgroup, but inherits the nodeset from numad
and resets the affinity for the domain process.

* src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator
                          to accept the passed nodeset
* src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset

bb81021b

20 Oct, 2012 2 commits

blockjob: remove unused parameters after previous patch · 67aea3fb

Committed by Eric Blake on Oct 17, 2012

Minor cleanup made possible by previous simplifications.

* src/qemu/qemu_cgroup.h (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Alter signature.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup, qemuSetupCgroup): Update all uses.
* src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice)
(qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Likewise.

67aea3fb

storage: use cache to walk backing chain · 38c4a9cc

Committed by Eric Blake on Oct 09, 2012

We used to walk the backing file chain at least twice per disk,
once to set up cgroup device whitelisting, and once to set up
security labeling.  Rather than walk the chain every iteration,
which possibly includes calls to fork() in order to open root-squashed
NFS files, we can exploit the cache of the previous patch.

* src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter
signature.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller
to supply backing chain via disk, if recursion is desired.
* src/security/security_dac.c
(virSecurityDACSetSecurityImageLabel): Adjust caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityImageLabel): Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Likewise.
(qemuSetupCgroup): Pre-populate chain.

38c4a9cc

17 Oct, 2012 1 commit

qemu: Pin the emulator when only cpuset is specified · ba63d8f7

Committed by Martin Kletzander on Oct 17, 2012

According to our recent changes (clarifications), we should be pinning
qemu's emulator processes using the <vcpu> 'cpuset' attribute in case
there is no <emulatorpin> specified.  This however doesn't work
entirely as expected and this patch should resolve all the remaining
issues.

ba63d8f7

11 Oct, 2012 1 commit

qemu: Implement startupPolicy for USB passed through devices · edc9269a

Committed by Jiri Denemark on Oct 04, 2012

edc9269a

21 Sep, 2012 1 commit

maint: fix up copyright notice inconsistencies · 4ecb723b

Committed by Eric Blake on Sep 20, 2012

https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.

* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/;  If/.  If/

4ecb723b

18 Sep, 2012 2 commits

use virBitmap to store numa nodemask info. · 75b198b3

Committed by Hu Tao on Sep 14, 2012

75b198b3

use virBitmap to store cpupin info · f970d848

Committed by Hu Tao on Sep 14, 2012

f970d848

12 Sep, 2012 1 commit

fix bug in qemuSetupCgroupForEmulator · f7e1a546

Committed by Hu Tao on Sep 06, 2012

Should not return 0 when it fails to set up the cgroup.

f7e1a546

06 Sep, 2012 1 commit

qemu: don't pin all the cpus · 9f86fb93

Committed by Martin Kletzander on Sep 04, 2012

This is another fix for the emulator-pin series. When going through
the cputune pinning settings, the current code is trying to pin all
the CPUs, even when not all of them are specified. This causes an error
in the subsequent function which, of course, cannot find the cpu to
pin. Since it's enough to pass the correct VCPU ID to the function,
the fix is trivial.

9f86fb93

31 Aug, 2012 1 commit

qemu: Don't ignore CPU tuning config if required cgroups are missing · 774eb45b

Committed by Jiri Denemark on Aug 29, 2012

When domain XML contains any of the elements for setting up CPU
scheduling parameters (period, quota, emulator_period, or
emulator_quota) we need cpu cgroup to enforce the configuration.
However, the existing code would just silently ignore such settings if
either cgroups were not available at all or cpu cgroup was not available.
Moreover, APIs for manipulating CPU scheduler parameters were already
failing if cpu cgroup was not available. This patch makes cpu cgroup
mandatory for all domains that use CPU scheduling elements in their XML.

774eb45b

29 Aug, 2012 1 commit

qemu: Fix starting domains with no cpu cgroup · 0c7cca36

Committed by Jiri Denemark on Aug 29, 2012

If cgroups are enabled in general but cpu cgroup is disabled in
qemu.conf or not mounted at all, libvirt would refuse to start any
domain even though scheduler parameters are not set in domain XML.

This patch makes cpu cgroup mandatory only for domains that actually
want to use it.

0c7cca36
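
Taken together, 774eb45b and 0c7cca36 make the cpu cgroup a requirement only for domains that set CPU scheduling elements. A minimal Python sketch of that rule follows; the function names and the dict representation of the parsed tuning settings are hypothetical, while the four element names are the ones listed in 774eb45b.

```python
from typing import Mapping

# Element names listed in the 774eb45b commit message.
CPU_TUNING_KEYS = ("period", "quota", "emulator_period", "emulator_quota")


def needs_cpu_cgroup(cputune: Mapping[str, int]) -> bool:
    """True if the domain sets any CPU scheduling element."""
    return any(key in cputune for key in CPU_TUNING_KEYS)


def check_startable(cputune: Mapping[str, int], cpu_cgroup_available: bool) -> None:
    """Refuse to start only when tuning is requested but cannot be enforced."""
    if needs_cpu_cgroup(cputune) and not cpu_cgroup_available:
        raise RuntimeError("CPU tuning requested but cpu cgroup is not available")


check_startable({}, cpu_cgroup_available=False)  # no tuning: start is allowed
check_startable({"quota": 50000}, cpu_cgroup_available=True)
print(needs_cpu_cgroup({"emulator_period": 100000}))
```

A domain without any of these elements starts regardless of whether the cpu cgroup is mounted or enabled, while a domain that sets one fails early instead of having the setting silently ignored.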

27 Aug, 2012 1 commit

qemu: fix regression with pinning · 16ebec2b

Committed by Martin Kletzander on Aug 24, 2012

Commit 4b03d591 changed the pinning
behavior in a way that makes some machines non-startable.

The comment mentioning that we cannot control each vcpu when there is
no VCPU <-> PID mapping available is true; however, this isn't
necessarily an error, because this can be caused by old QEMU without
support for the "query-cpus" command as well as software-emulated
machines that don't create more than one process.

16ebec2b

22 Aug, 2012 1 commit

qemu: introduce period/quota tuning for emulator · b65dafa8

Committed by Hu Tao on Aug 21, 2012

This patch introduces support for setting the emulator's period and
quota to limit cpu bandwidth when the VM starts.  It also updates the
XML schema and docs for the new entries.

b65dafa8