  1. 24 Jul, 2010 4 commits
    • C
      Qemu Monitor API entry point. · 21adf03c
      Chris Lalancette committed
      Add the library entry point for the new virDomainQemuMonitorCommand()
      API.  Because this is not part of the "normal" libvirt API,
      it gets its own header file and library file, and will
      get its own over-the-wire protocol later in the series.
      
      Changes since v1:
       - Go back to using the virDriver table for qemuDomainMonitorCommand, due to
         linking issues
       - Added versioning information to the libvirt-qemu.so
      
      Changes since v2:
       - None
      
      Changes since v3:
       - Add LGPL header to libvirt-qemu.c
       - Make virLibConnError and virLibDomainError macros instead of function calls
      
      Changes since v4:
       - Move exported symbols to libvirt_qemu.syms
      Signed-off-by: Chris Lalancette <clalance@redhat.com>
    • C
      Handle arbitrary qemu command-lines in qemuParseCommandLine. · ae027de3
      Chris Lalancette committed
      Now that we have the ability to specify arbitrary qemu
      command-line parameters in the XML, use it to handle unknown
      command-line parameters when doing a native-to-xml conversion.
      
      Changes since v1:
       - Rename num_extra to num_args
       - Fix up a memory leak on an error path
      
      Changes since v2:
       - Add a VIR_WARN when adding the argument via qemu:arg
      
      Changes since v3:
       - None
      Signed-off-by: Chris Lalancette <clalance@redhat.com>
    • C
      Qemu arbitrary command-line arguments. · 869939a5
      Chris Lalancette committed
      Implement the qemu hooks for XML namespace data.  This
      allows us to specify a qemu XML namespace, and then
      specify:
      
      <qemu:commandline>
       <qemu:arg value='arg'/>
       <qemu:env name='name' value='value'/>
      </qemu:commandline>
      
      in the domain XML.
      
      Changes since v1:
       - Change the <qemu:arg>arg</qemu:arg> XML to <qemu:arg value='arg'/> XML
       - Fix up some memory leaks in qemuDomainDefNamespaceParse
       - Rename num_extra and extra to num_args and args, respectively
       - Fixed up some error messages
       - Make sure to escape user-provided data in qemuDomainDefNamespaceFormatXML
      
      Changes since v2:
       - Add checking to ensure environment variable names are valid
       - Invert the logic in qemuDomainDefNamespaceFormatXML to return early
      
      Changes since v3:
       - Change strspn() to c_isalpha() check of first letter of environment variable
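
The environment-variable name check mentioned in the v2 and v3 notes can be sketched roughly as follows. This is only an illustration: `env_name_is_valid` and `ascii_isalpha` are hypothetical names (the latter stands in for gnulib's `c_isalpha()`), and the exact set of accepted characters is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Locale-independent ASCII letter test, standing in for gnulib's c_isalpha(). */
bool ascii_isalpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Hypothetical helper: true if `name` looks like a usable environment
 * variable name for <qemu:env name='...'/>.  Assumed rule: the first
 * character is a letter or '_', the rest are letters, digits or '_'. */
bool env_name_is_valid(const char *name)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_";

    assert(name != NULL);
    if (name[0] == '\0')
        return false;
    if (!ascii_isalpha(name[0]) && name[0] != '_')
        return false;
    /* strspn() returns the length of the leading run of allowed characters. */
    return strspn(name, allowed) == strlen(name);
}
```

Checking the first character separately is what rules out names such as `1ABC`, which a plain `strspn()` over the whole allowed set would accept.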
      Signed-off-by: Chris Lalancette <clalance@redhat.com>
    • C
      Add namespace callback hooks to domain_conf. · d55b7345
      Chris Lalancette committed
      This patch adds namespace XML parsers to be hooked into
      the main domain parser.  This allows for individual hypervisor
      drivers to add per-namespace XML into the main domain XML.
      
      Changes since v1:
       - Use a statically declared table for caps->ns, removing the need to
         allocate/free it.
      
      Changes since v2:
       - None
      
      Changes since v3:
       - None
      Signed-off-by: Chris Lalancette <clalance@redhat.com>
  2. 23 Jul, 2010 5 commits
    • C
      pciSharesBusWithActive fails to find multiple devices on bus · f4828ca3
      Chris Wright committed
      The first conditional is always true which means the iterator will
      never find another device on the same bus.
      
          if (dev->domain != check->domain ||
              dev->bus != check->bus ||
        ----> (check->slot == check->slot &&
               check->function == check->function)) <-----
      
      The goal of that check is to verify that the device is either:

        in a different PCI domain
        on a different bus
        the very same device

      This means libvirt may issue a secondary bus reset when there are
      devices on that bus that are actively in use by the host or another
      guest.
      
      * src/util/pci.c: fix a bogus test in pciSharesBusWithActive()
    • D
      Fix incorrect use of private data in remote driver · 8d4f0242
      Daniel P. Berrange committed
      The remote driver is using the wrong privateData field in
      a couple of functions. This is harmless for stateful
      drivers like QEMU/UML/LXC, but will crash with Xen.
      
      * src/remote/remote_driver.c: Fix use of privateData field
    • D
      Set a stable & high MAC addr for guest TAP devices on host · 6ea90b84
      Daniel P. Berrange committed
      A Linux software bridge will assume the MAC address of the enslaved
      interface with the numerically lowest MAC addr. When the bridge
      changes MAC address there is a period of network blackout, so a
      change should be avoided. The kernel gives TAP devices a completely
      random MAC address. Occasionally the random TAP device MAC is lower
      than that of the physical interface (eth0, eth1, etc.) that is enslaved,
      causing the bridge to change its MAC.
      
      This change sets an explicit MAC address for all TAP devices created
      using the configured MAC from the XML, but with the high byte set
      to 0xFE. This should ensure TAP device MACs are higher than any
      physical interface MAC.
      
      * src/qemu/qemu_conf.c, src/uml/uml_conf.c: Pass in a MAC addr
        for the TAP device with high byte set to 0xFE
      * src/util/bridge.c, src/util/bridge.h: Set a MAC when creating
        the TAP device to override random MAC
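
A minimal sketch of the address derivation described above, assuming "high byte" means the first octet of the MAC. `tap_mac_from_guest` and `MAC_LEN` are hypothetical names for this example only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Copy the guest's configured MAC into `tap_mac`, forcing the first
 * octet to 0xFE so the TAP device compares higher than typical
 * physical interface addresses. */
void tap_mac_from_guest(const uint8_t guest_mac[MAC_LEN],
                        uint8_t tap_mac[MAC_LEN])
{
    assert(guest_mac != NULL && tap_mac != NULL);
    memcpy(tap_mac, guest_mac, MAC_LEN);
    tap_mac[0] = 0xFE;
}
```

Because the remaining five octets come from the XML, the TAP address stays stable across guest restarts instead of being re-randomised by the kernel.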
    • D
      Fix PCI address assignment if no IDE controller is present · 020d2204
      Daniel P. Berrange committed
      The PCI slot 1 must be reserved at all times, since PIIX3 is
      always present, even if no IDE device is in use for guest disks
      
      * src/qemu/qemu_conf.c: Always reserve slot 1 for PIIX3
    • R
      lxc: force kill of init process by sending SIGKILL if needed · 7af5f468
      Ryota Ozaki committed
      The init process may survive SIGTERM. For example, if the original
      init program is used, it is definitely not killed by SIGTERM.
      
      * src/lxc/lxc_controller.c: kill with SIGKILL if SIGTERM wasn't
        sufficient
  3. 22 Jul, 2010 4 commits
    • L
      Remove erroneous setting of return value to errno. · ae3d31bf
      Laine Stump committed
      One error exit in virStorageBackendCreateBlockFrom was setting the
      return value to errno. The convention for volume build functions is to
      return 0 on success or -1 on failure. Not only was it not necessary to
      set the return value (it defaults to -1, and is set to 0 when
      everything has been successfully completed), in the case that some
      caller were checking for < 0 rather than != 0, they would incorrectly
      believe that it completed successfully.
    • L
      Change virDirCreate to return -errno on failure. · 3e0f05fc
      Laine Stump committed
      virDirCreate also previously returned 0 on success and errno on
      failure. This makes it fit the recommended convention of returning 0
      on success, -errno (ie a negative number) on failure.
    • L
      Make virStorageBackendCopyToFD return -errno. · ace1a2ba
      Laine Stump committed
      Previously virStorageBackendCopyToFD would simply return -1 on
      error. This made the error return from one of its callers inconsistent
      (createRawFileOpHook is supposed to return -errno, but if
      virStorageBackendCopyToFD failed, createRawFileOpHook would just
      return -1). Since there is a useful errno in every case of error
      return from virStorageBackendCopyToFD, and since the other uses of
      that function ignore the return code (beyond simply checking to see if
      it is < 0), this is a safe change.
    • L
      Change virFileOperation to return -errno (ie < 0) on error. · 2ad04f78
      Laine Stump committed
      virFileOperation previously returned 0 on success, or the value of
      errno on failure. Although there are other functions in libvirt that
      use this convention, the preferred (and more common) convention is to
      return 0 on success and -errno (or simply -1 in some cases) on
      failure. This way the check for failure is always (ret < 0).
      
      * src/util/util.c - change virFileOperation and virFileOperationNoFork to
                          return -errno on failure.
      
      * src/storage/storage_backend.c, src/qemu/qemu_driver.c
        - change the hook functions passed to virFileOperation to return
          -errno on failure.
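
The return convention described in this commit can be illustrated with a small standalone function. `probe_readable` is a hypothetical helper written for this example and is not part of libvirt.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and close it again.
 * Returns 0 on success and -errno on failure, so callers can always
 * test for failure with (ret < 0). */
int probe_readable(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;   /* negative errno, never a positive value */
    if (close(fd) < 0)
        return -errno;
    return 0;
}
```

Returning a positive errno instead would slip past any caller that checks `ret < 0`, which is the failure mode these commits remove.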
  4. 21 Jul, 2010 5 commits
    • D
      Re-arrange PCI device address assignment to match QEMU's default · 0e308c2c
      Daniel P. Berrange committed
      To try and ensure that people upgrading from old QEMU get guests
      with the same PCI device ordering, change the way we assign addrs
      to match QEMU's default order. This should make Windows less
      annoyed.
      
      * src/qemu/qemu_conf.c: Follow QEMU's default PCI ordering
        logic when assigning addresses
      * tests/*.args: Update for changed PCI addresses
    • D
      Explicitly represent balloon device in XML and handle PCI address · b2f18635
      Daniel P. Berrange committed
      To allow compatibility with older QEMU PCI device slot assignment
      it is necessary to explicitly track the balloon device in the
      XML. This introduces a new device
      
         <memballoon model='virtio|xen'/>
      
      It can also have a PCI address, auto-assigned if necessary.
      
      The memballoon will be automatically added to all Xen and QEMU
      guests by default.
      
      * docs/schemas/domain.rng: Add <memballoon> element
      * src/conf/domain_conf.c, src/conf/domain_conf.h: parsing
        and formatting for memballoon device. Always add a memory
        balloon device to Xen/QEMU if none exists in XML
      * src/libvirt_private.syms: Export memballoon model APIs
      * src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Honour the
        PCI device address in memory balloon device
      * tests/*: Update to test new functionality
    • D
      Rearrange VGA/IDE controller address reservation · ccd2c82e
      Daniel P. Berrange committed
      The first VGA and IDE devices need to have fixed PCI address
      reservations. Currently this is handled inline with the other
      non-primary VGA/IDE devices. The fixed virtio balloon device
      at slot 3 ensures auto-assignment skips slots 1/2. The
      virtio address will shortly become configurable though. This
      means the reservation of fixed slots needs to be done upfront
      to ensure that they don't get re-used for other devices.
      
      This is more or less reverting the previous changeset:
      
        commit 83acdeaf
        Author: Daniel P. Berrange <berrange@redhat.com>
        Date:   Wed Feb 3 16:11:29 2010 +0000
      
        Fix restore of QEMU guests with PCI device reservation
      
      The difference is that this time, instead of unconditionally
      reserving the address, we only reserve the address if it was
      initially type=none. Addresses of type=pci were handled
      earlier in the process by qemuDomainPCIAddressSetCreate(). This
      ensures the restore step doesn't have problems.
      
      * src/qemu/qemu_conf.c: Reserve first VGA + IDE address
        upfront
    • D
      Remove inappropriate use of VIR_ERR_NO_SUPPORT · 021251bd
      Daniel P. Berrange committed
      The VIR_ERR_NO_SUPPORT refers to an API which is not implemented.
      There is a separate VIR_ERR_CONFIG_UNSUPPORTED for XML config
      options that are not available with the current hypervisor.
      
      * src/qemu/qemu_conf.c, src/qemu/qemu_driver.c: Replace
        many uses of VIR_ERR_NO_SUPPORT with VIR_ERR_CONFIG_UNSUPPORTED
    • D
      Remove bogus free of static strings · 4d134188
      Daniel P. Berrange committed
      Remove bogus free of statically allocated strings introduced
      in 03ca4204
      
      * src/conf/capabilities.c: Don't free static strings for
        default disk driver type/name
  5. 20 Jul, 2010 11 commits
    • C
      Fix a deadlock in bi-directional p2p concurrent migration. · f0c8e1cb
      Chris Lalancette committed
      If you try to execute two concurrent migrations p2p
      from A->B and B->A, the two libvirtd's will deadlock
      trying to perform the migrations.  The reason for this is
      that in p2p migration, the libvirtd's are responsible for
      making the RPC Prepare, Migrate, and Finish calls.  However,
      they are currently holding the driver lock while doing so,
      which basically guarantees deadlock in this scenario.
      
      This patch fixes the situation by adding
      qemuDomainObjEnterRemoteWithDriver and
      qemuDomainObjExitRemoteWithDriver helper methods.  The Enter
      helper takes an additional object reference, then drops both the
      domain object lock and the driver lock.  The Exit helper takes
      both the driver and domain object locks, then drops the
      reference.  Adding calls to these Enter and Exit helpers
      around remote calls in the various migration methods
      seems to fix the problem for me in testing.
      
      This should make the situation safe. The additional domain
      object reference ensures that the domain object won't disappear
      while this operation is happening.  The BeginJob that is called
      inside of qemudDomainMigratePerform ensures that we can't execute a
      second migrate (or shutdown, or save, etc) job while the
      migration is active.  Finally, the additional check on the state
      of the vm after we reacquire the locks ensures that we can't
      be surprised by an external event (domain crash, etc).
      Signed-off-by: Chris Lalancette <clalance@redhat.com>
    • L
      fsync new storage volumes even if new volume was copied. · e0f26c46
      Laine Stump committed
      Originally the storage volume files were opened with O_DSYNC to make
      sure they were flushed to disk immediately. It turned out that this
      was extremely slow in some cases, so the O_DSYNC was removed in favor
      of just calling fsync() after all the data had been written. However,
      this call to fsync was inside the block that is executed to zero-fill
      the end of the volume file. In cases where the new volume is copied
      from an old volume, and they are the same length, this fsync would
      never take place.
      
      Now the fsync is *always* done, unless there is an error (in which
      case it isn't important, and is most likely inappropriate).
    • L
      Don't skip zero'ing end of volume file when inputvol is shorter than newvol · 35bebb57
      Laine Stump committed
      A missing set of braces around an error condition caused us to skip
      zero'ing out the remainder of a new volume file if the new volume was
      longer than the original (the goto was supposed to be taken only in
      the case of error, but was always being taken).
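
The class of bug described here is easy to reproduce. The sketch below shows the corrected shape with a hypothetical stand-in function, `finish_volume`, whose parameters and byte counts are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the volume build step: `copy_status` is the
 * result of copying the input volume, `copied` and `total` are byte
 * counts.  On success, `*zeroed` receives the size of the tail that
 * would be zero-filled. */
int finish_volume(int copy_status, size_t copied, size_t total,
                  size_t *zeroed)
{
    int ret = -1;

    assert(zeroed != NULL && copied <= total);
    *zeroed = 0;

    /* Without these braces the goto is not part of the if body: it runs
     * on every call and the zero-fill below is skipped. */
    if (copy_status < 0) {
        /* error reporting would go here */
        goto cleanup;
    }

    *zeroed = total - copied;   /* remainder that must be zero-filled */
    ret = 0;

cleanup:
    return ret;
}
```

In the unbraced form only the first statement after the `if` is conditional, which matches the behaviour the commit message describes.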
    • D
      Use the extracted backing store format in storage volume lookup · 187da82f
      Daniel P. Berrange committed
      The storage volume lookup code was probing for the backing store
      format, instead of using the format extracted from the file
      itself. This meant it could report inaccurate information. If
      a format is included in the file, then use that in preference,
      with probing as a fallback.
      
      * src/storage/storage_backend_fs.c: Use extracted backing store
        format
    • D
      Rewrite qemu-img backing store format handling · 27f45438
      Daniel P. Berrange committed
      When creating qcow2 files with a backing store, it is important
      to set an explicit format to prevent QEMU probing. The storage
      backend was only doing this if it found a 'kvm-img' binary. This
      is wrong because plenty of kvm-img binaries don't support an
      explicit format, and plenty of 'qemu-img' binaries do support
      a format. The result was that most qcow2 files were not getting
      a backing store format.
      
      This patch runs 'qemu-img -h' to check for the two supported
      argument formats

        '-o backing_format=raw'
        '-F raw'

      and uses whichever option it finds.
      
      * src/storage/storage_backend.c: Query binary to determine
        how to set the backing store format
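
One way to picture the capability check is a function that classifies the captured help text. This is a sketch under stated assumptions: the marker strings `"[-F "` and `"[-o "`, the order in which they are tested, and all names below are hypothetical and are not taken from src/storage/storage_backend.c.

```c
#include <assert.h>
#include <string.h>

enum backing_fmt_style {
    BACKING_FMT_NONE,     /* no explicit format option advertised */
    BACKING_FMT_FLAG,     /* '-F raw' */
    BACKING_FMT_OPTIONS   /* '-o backing_format=raw' */
};

/* Classify the output of 'qemu-img -h'.  The substrings searched for
 * are assumptions made for this sketch; real help output may differ. */
enum backing_fmt_style classify_help(const char *help)
{
    assert(help != NULL);
    if (strstr(help, "[-F ") != NULL)
        return BACKING_FMT_FLAG;
    if (strstr(help, "[-o ") != NULL)
        return BACKING_FMT_OPTIONS;
    return BACKING_FMT_NONE;
}
```

Keying the decision on what the binary advertises, rather than on whether it is named kvm-img or qemu-img, is the point of the change.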
    • D
      Add ability to set a default driver name/type when parsing disks · 03ca4204
      Daniel P. Berrange committed
      Record a default driver name/type in capabilities struct. Use this
      when parsing disks if value is not set in XML config.
      
      * src/conf/capabilities.h: Record default driver name/type for disks
      * src/conf/domain_conf.c: Fallback to default driver name/type
        when parsing disks
      * src/qemu/qemu_driver.c: Set default driver name/type to raw
    • D
      Disable all disk probing in QEMU driver & add config option to re-enable · 68719c4b
      Daniel P. Berrange committed
      Disk format probing is now disabled by default. A new config
      option in /etc/libvirt/qemu.conf will re-enable it for existing
      deployments where this causes trouble.
    • D
      Pass security driver object into all security driver callbacks · f70e0809
      Daniel P. Berrange committed
      The implementation of security driver callbacks often needs
      to access the security driver object. Currently only a handful
      of callbacks include the driver object as a parameter. Later
      patches require this in many more places.
      
      * src/qemu/qemu_driver.c: Pass in the security driver object
        to all callbacks
      * src/qemu/qemu_security_dac.c, src/qemu/qemu_security_stacked.c,
        src/security/security_apparmor.c, src/security/security_driver.h,
        src/security/security_selinux.c: Add a virSecurityDriverPtr
        param to all security callbacks
    • D
      Convert all disk backing store loops to shared helper API · a8853344
      Daniel P. Berrange committed
      Update the QEMU cgroups code, QEMU DAC security driver, SELinux
      and AppArmor security drivers to use the shared helper API
      virDomainDiskDefForeachPath().
      
      * src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
        src/security/security_selinux.c, src/security/virt-aa-helper.c:
        Convert over to use virDomainDiskDefForeachPath()
    • D
      Add an API for iterating over disk paths · 9d0a630f
      Daniel P. Berrange committed
      There is duplicated code which iterates over disk backing stores
      performing some action. Provide a convenient helper for doing
      this to eliminate duplication & risk of mistakes with disk format
      probing
      
      * src/conf/domain_conf.c, src/conf/domain_conf.h,
        src/libvirt_private.syms: Add virDomainDiskDefForeachPath()
    • D
      Require format to be passed into virStorageFileGetMetadata · bf80fc68
      Daniel P. Berrange committed
      Require the disk image format to be passed into virStorageFileGetMetadata.
      If this is set to VIR_STORAGE_FILE_AUTO, then the format will be
      resolved using probing. This makes it easier to control when
      probing will be used
      
      * src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
        src/security/security_selinux.c, src/security/virt-aa-helper.c:
        Set VIR_STORAGE_FILE_AUTO when calling virStorageFileGetMetadata.
      * src/storage/storage_backend_fs.c: Probe for disk format before
        calling virStorageFileGetMetadata.
      * src/util/storage_file.h, src/util/storage_file.c: Remove format
        from virStorageFileMeta struct & require it to be passed into
        method.
  6. 19 Jul, 2010 4 commits
    • D
      Refactor virStorageFileGetMetadataFromFD to separate functionality · c70cb0f4
      Daniel P. Berrange committed
      The virStorageFileGetMetadataFromFD did two jobs in one. First
      it probed for storage type, then it extracted metadata for the
      type. It is desirable to be able to separate these jobs, allowing
      probing without querying metadata, and querying metadata without
      probing.
      
      To prepare for this, split out probing code into a new pair of
      methods
      
        virStorageFileProbeFormatFromFD
        virStorageFileProbeFormat
      
      * src/util/storage_file.c, src/util/storage_file.h,
        src/libvirt_private.syms: Introduce virStorageFileProbeFormat
        and virStorageFileProbeFormatFromFD
    • D
      Remove 'type' field from FileTypeInfo struct · 779b6ea7
      Daniel P. Berrange committed
      Instead of including a field in FileTypeInfo struct for the
      disk format, rely on the array index matching the format.
      Use verify() to assert the correct number of elements in the
      array.
      
      * src/util/storage_file.c: remove type field from FileTypeInfo
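
The index-equals-format idea can be sketched with a compile-time size check. The enum, struct, and magic strings below are hypothetical and heavily shortened. C11 `_Static_assert` stands in for gnulib's `verify()`, and the designated initializers are a choice made for this example rather than something the commit states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened format list; FORMAT_LAST counts the entries. */
enum file_format { FORMAT_RAW, FORMAT_QCOW, FORMAT_QCOW2, FORMAT_LAST };

struct file_type_info {
    const char *magic;   /* leading bytes identifying the format, or NULL */
};

/* The array index is the format, so no separate 'type' field is needed. */
const struct file_type_info file_type_info[] = {
    [FORMAT_RAW]   = { NULL },
    [FORMAT_QCOW]  = { "QFI\xfb" },
    [FORMAT_QCOW2] = { "QFI\xfb" },
};

/* Fails the build if an enum value is added without a table entry. */
_Static_assert(sizeof(file_type_info) / sizeof(file_type_info[0]) == FORMAT_LAST,
               "file_type_info must have one entry per format");

/* Look up the magic for a format directly by index. */
const char *format_magic(enum file_format fmt)
{
    assert(fmt >= FORMAT_RAW && fmt < FORMAT_LAST);
    return file_type_info[fmt].magic;
}
```

The compile-time assertion replaces the run-time consistency that the removed field used to provide.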
    • D
      Extract the backing store format as well as name, if available · a93402d4
      Daniel P. Berrange committed
      When QEMU opens a backing store for a QCow2 file, it will
      normally auto-probe for the format of the backing store,
      rather than assuming it has the same format as the referencing
      file. There is a QCow2 extension that allows an explicit format
      for the backing store to be embedded in the referencing file.
      This closes the auto-probing security hole in QEMU.
      
      This backing store format can be useful for libvirt users
      of virStorageFileGetMetadata, so extract this data and report
      it.
      
      QEMU does not require disk image backing store files to be in
      the same format as the file that links to them. It will auto-probe the disk
      format for the backing store when opening it. If the backing
      store was intended to be a raw file this could be a security
      hole, because a guest may have written data into its disk that
      then makes the backing store look like a qcow2 file. If it can
      trick QEMU into thinking the raw file is a qcow2 file, it can
      access arbitrary files on the host by adding further backing
      store links.
      
      To address this, callers of virStorageFileGetMeta need to be
      told of the backing store format. If no format is declared,
      they can make a decision whether to allow format probing or
      not.
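
The caller-side decision described above might look like the following. This is a hedged sketch: the enum, the function name, and the choice to signal refusal by returning `FMT_NONE` are assumptions for illustration, not libvirt's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format identifiers; FMT_NONE means "not declared". */
enum disk_fmt { FMT_NONE = -1, FMT_RAW = 0, FMT_QCOW2 = 1 };

/* Pick the backing store format a caller should trust.
 *   declared:      format embedded in the referencing file, or FMT_NONE
 *   probed:        result of probing the backing file's contents
 *   allow_probing: caller policy
 * Returns FMT_NONE when nothing may be trusted, so the caller can refuse
 * to follow the backing store link. */
enum disk_fmt resolve_backing_format(enum disk_fmt declared,
                                     enum disk_fmt probed,
                                     bool allow_probing)
{
    if (declared != FMT_NONE)
        return declared;      /* an explicit format always wins */
    if (allow_probing)
        return probed;        /* fall back to probing only if permitted */
    return FMT_NONE;
}
```

A declared format is never overridden by a probe result, which is what prevents guest-written data in a raw image from being interpreted as a qcow2 header.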
    • D
      CVE-2010-2242 Apply a source port mapping to virtual network masquerading · c5678530
      Daniel P. Berrange committed
      IPtables will seek to preserve the source port unchanged when
      doing masquerading, if possible. NFS has a pseudo-security
      option where it checks for the source port <= 1023 before
      allowing a mount request. If an admin has used this to make the
      host OS trusted for mounts, the default iptables behaviour will
      potentially allow NAT'd guests access too. This needs to be
      stopped.
      
      With this change, the iptables -t nat -L -n -v rules for the
      default network will be
      
      Chain POSTROUTING (policy ACCEPT 95 packets, 9163 bytes)
       pkts bytes target     prot opt in     out     source               destination
         14   840 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535
         75  5752 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535
          0     0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24
      
      * src/network/bridge_driver.c: Add masquerade rules for TCP
        and UDP protocols
      * src/util/iptables.c, src/util/iptables.c: Add source port
        mappings for TCP & UDP protocols when masquerading.
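
As a rough illustration, rules like the TCP and UDP entries in the listing above could be produced by a command of the following shape. The helper name, the use of `-A POSTROUTING`, and the argument order are assumptions for this sketch. Libvirt builds its iptables invocations differently, and only the `MASQUERADE` target's `--to-ports` option is relied on here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format an illustrative iptables command that masquerades traffic
 * leaving `network` and remaps source ports to 1024-65535.
 * Returns the snprintf() result, or -1 if `proto` is not tcp or udp
 * (port ranges only make sense for those protocols). */
int format_masquerade_rule(char *buf, size_t len,
                           const char *network, const char *proto)
{
    assert(buf != NULL && network != NULL && proto != NULL);
    if (strcmp(proto, "tcp") != 0 && strcmp(proto, "udp") != 0)
        return -1;
    return snprintf(buf, len,
                    "iptables -t nat -A POSTROUTING -s %s ! -d %s "
                    "-p %s -j MASQUERADE --to-ports 1024-65535",
                    network, network, proto);
}
```

Forcing the source port above 1023 means a NAT'd guest can no longer appear to an NFS server as a privileged port on the host.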
  7. 16 Jul, 2010 1 commit
    • D
      RFC: Canonicalize block device paths · ae3275c0
      David Allan committed
      There are many naming conventions for partitions associated with a
      block device.  Some of the major ones are:
      
      /dev/foo -> /dev/foo1
      /dev/foo1 -> /dev/foo1p1
      /dev/mapper/foo -> /dev/mapper/foop1
      /dev/disk/by-path/foo -> /dev/disk/by-path/foo-part1
      
      The universe of possible conventions isn't clear.  Rather than trying
      to understand all possible conventions, this patch divides devices
      into two groups, device mapper devices and everything else.  Device
      mapper devices seem always to follow the convention of device ->
      devicep1; everything else is canonicalized.
  8. 15 Jul, 2010 2 commits
  9. 14 Jul, 2010 1 commit
  10. 13 Jul, 2010 3 commits
    • J
      cpuCompare: Fix crash on unexpected CPU XML · f5055f23
      Jiri Denemark committed
      cpuCompare: Fix crash on unexpected CPU XML · f5055f23
      Jiri Denemark 提交于
      When comparing a CPU without <model> element, such as
      
          <cpu>
              <topology sockets='1' cores='1' threads='1'/>
          </cpu>
      
      libvirt would happily crash without warning.
    • J
      cpu: Fail when CPU type cannot be detected from XML · 517aba9f
      Jiri Denemark committed
      When autodetecting whether XML describes guest or host CPU, the presence
      of <arch> element is checked. If it's present, we treat the XML as host
      CPU definition. Which is right, since guest CPU definitions do not
      contain <arch> element. However, if at the same time the root <cpu>
      element contains `match' attribute, we would silently ignore it and
      still treat the XML as host CPU. We should rather refuse such invalid
      XML.
    • J
      cpuCompare: Fix comparison of two host CPUs · ac3daf08
      Jiri Denemark committed
      When a CPU to be compared with host CPU describes a host CPU instead of
      a guest CPU, the result is incorrect. This is because instead of
      treating additional features in host CPU description as required, they
      were treated as if they were mentioned with all possible policies at the
      same time.