    vz: make vz driver more responsive · 84299235
    Nikolay Shirokovskiy committed
    The current vz driver implementation is not usable when it comes to
    long-running operations. Migrating or saving a domain blocks all
    other operations, even query ones which are expected to stay available.
    This patch addresses this problem.
    
    All vz driver API calls fall into the following 3 groups:
    1. only query domain cache (virDomainObj, vz cache statistic)
       examples are vzDomainGetState, vzDomainGetXMLDesc etc.
    2. use thread shared sdkdom object
       examples are vzDomainSetMemoryFlags, vzDomainAttachDevice etc.
    3. use neither thread shared sdkdom object nor domain cache
       examples are vzDomainSnapshotListNames, vzDomainSnapshotGetXMLDesc etc.
    
    API calls from group 1 don't need to be changed as they hold the domain lock
    only for a short period of time. These calls [1] are easily distinguished. They
    query the domain object thru libvirt common code or query the vz sdk statistics
    handle thru vz sdk sync operations.
    
    vzDomainInterfaceStats is the only exception. It uses the sdkdom object to
    convert an interface name to its vz sdk stack index, which could not be saved
    in the domain cache. Interface statistics are available thru this stack index
    as a key rather than thru the name. As a result we can get accidental 'not
    known interface' errors on querying interface stats. The reason is that in the
    process of updating domain configuration we drop all devices and then recreate
    them in the sdkdom object, and the domain lock can be dropped meanwhile (to
    remove networks for existing bridged interfaces and/or (re)create new ones).
    We can fix this by changing the way we support bridged interfaces or by
    reordering operations and changing bridged networks beforehand. Anyway this is
    better than moving this API call into group 2 and making it an exclusive job.
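
    The failure window described above can be sketched roughly as follows. This
    is only an illustration: struct sdk_ifaces, iface_stack_index,
    ifaces_drop_all and ifaces_add are hypothetical names, not vz sdk or
    libvirt API, and the array index merely stands in for the vz sdk stack
    index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IFS 4

/* Hypothetical stand-in for the device list kept in the sdkdom object. */
struct sdk_ifaces {
    const char *names[MAX_IFS]; /* names[i] belongs to stack index i */
    size_t count;
};

/* Returns the stack index for ifname, or -1 for a 'not known interface'. */
static int iface_stack_index(const struct sdk_ifaces *ifs, const char *ifname)
{
    assert(ifs != NULL && ifname != NULL);
    for (size_t i = 0; i < ifs->count; i++) {
        if (strcmp(ifs->names[i], ifname) == 0)
            return (int)i;
    }
    return -1;
}

/* First half of a configuration update: every device is dropped. */
static void ifaces_drop_all(struct sdk_ifaces *ifs)
{
    ifs->count = 0;
}

/* Second half: devices are recreated one by one from the new config. */
static int ifaces_add(struct sdk_ifaces *ifs, const char *ifname)
{
    if (ifs->count == MAX_IFS)
        return -1;
    ifs->names[ifs->count++] = ifname;
    return 0;
}
```

    A stats query that runs between ifaces_drop_all and the matching
    ifaces_add sees no devices and reports an unknown interface, even though
    the interface exists both before and after the update.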
    
    As to API calls from group 2, first the thread shared sdkdom object needs to
    be explained. vz sdk has only one handle for a given domain, thus threads need
    exclusive access to operate on it. These calls are fixed to drop and reacquire
    the domain lock on any lengthy operation, namely waiting for the result of an
    async vz sdk operation. As the lock is dropped we need to take an extra
    reference to the domain object if it is not taken already, as the domain object
    can be deleted from the list while the lock is dropped. As these operations use
    the thread shared sdkdom object, the simplest way to make calls from group 2
    consistent with each other is to make them mutually exclusive. This is done by
    taking/releasing the job condition thru calling the corresponding job routine.
    This approach makes group 1 and group 2 calls consistent with each other too.
    Not all calls of group 2 change the domain cache, but those that do update it
    thru prlsdkUpdateDomain, which holds the lock throughout the update.
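
    The shape of a group 2 call can be sketched with plain pthreads as below.
    This is a minimal sketch under assumptions, not the driver code: struct dom,
    dom_job_begin, dom_job_end, slow_sdk_op and dom_modify are hypothetical
    names, and slow_sdk_op only stands in for waiting on an async vz sdk
    operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain object; not libvirt's virDomainObj. */
struct dom {
    pthread_mutex_t lock;     /* the "domain lock" */
    pthread_cond_t job_cond;  /* signalled when the exclusive job ends */
    bool job_active;          /* true while a group 2 call owns the job */
    int refs;                 /* reference count, protected by lock */
    int cache_state;          /* stands in for the domain cache */
};

static void dom_init(struct dom *d)
{
    pthread_mutex_init(&d->lock, NULL);
    pthread_cond_init(&d->job_cond, NULL);
    d->job_active = false;
    d->refs = 1;              /* reference owned by the domain list */
    d->cache_state = 0;
}

/* Caller holds d->lock. Blocks until no other job is running. */
static void dom_job_begin(struct dom *d)
{
    while (d->job_active)
        pthread_cond_wait(&d->job_cond, &d->lock);
    d->job_active = true;
}

/* Caller holds d->lock. Lets the next waiting job proceed. */
static void dom_job_end(struct dom *d)
{
    d->job_active = false;
    pthread_cond_broadcast(&d->job_cond);
}

/* Placeholder for a lengthy async sdk operation. */
static int slow_sdk_op(int arg)
{
    return arg + 1;
}

/* Shape of a group 2 call: exclusive job, lock dropped while waiting. */
static int dom_modify(struct dom *d, int arg)
{
    int result;

    pthread_mutex_lock(&d->lock);
    d->refs++;                        /* keep the object alive while unlocked */
    dom_job_begin(d);                 /* exclude other group 2 calls */

    pthread_mutex_unlock(&d->lock);   /* group 1 queries can run from here */
    result = slow_sdk_op(arg);
    pthread_mutex_lock(&d->lock);

    d->cache_state = result;          /* cache update happens under the lock */
    dom_job_end(d);
    d->refs--;
    pthread_mutex_unlock(&d->lock);
    return result;
}
```

    Query calls only ever take the lock briefly, so in this sketch they can
    read cache_state while another thread sits inside slow_sdk_op, which is
    the responsiveness this patch is after.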
    
    API calls from group [2] are easily distinguished too. They use
    beginEdit/commit to change domain configuration (vzDomainSetMemoryFlags) and/or
    update the domain cache from sdkdom at the end of the operation (vzDomainSuspend).
    
    There is a known issue however. Frankly speaking it was introduced by the
    earlier patch '[PATCH 6/9] vz: cleanup loading domain code' from a different
    series. That patch significantly reduced the time the driver lock is held when
    creating a domain from an API call or as a result of a domain added event from
    vz sdk. The problem is that these two paths race on using the thread shared
    sdkdom, as we don't have a libvirt domain object yet and can not lock on it.
    However this doesn't invalidate the patch, as we can't use the former approach
    of preadding the domain into the list: we need the name at least, and the name
    is not given by the event. Anyway I'm against adding a half baked object into
    the list. Eventually this race can be fixed by extra measures. As to the
    current situation, races with different configurations are unlikely, and a race
    between adding a domain thru the vz driver and a simultaneous event from vz sdk
    is not dangerous as the configuration is the same.
    
    The last group [3] is API calls that need only the sdkdom object to make a vz
    sdk call and don't change the thread shared sdkdom object or domain cache in
    any way. For now these are mostly domain snapshot API calls. The changes are
    similar to those of group 2: they add an extra reference and drop/reacquire the
    lock while waiting for the vz async call result. One could simply take the
    immutable sdkdom object from the cache and drop the lock for the rest of the
    operation, but the chosen approach makes the implementation of these API calls
    somewhat similar to those from group 2 and thus a bit future-proof. As calls of
    group 3 don't need vz driver domain/vz sdk cache in any way, they are
    consistent with respect to API calls from groups 1 and 3.
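
    Why the extra reference matters while the lock is dropped can be shown with
    a rough sketch. All names here (struct dom, dom_new, dom_unref,
    remove_from_list, dom_snapshot_num, freed_count) are hypothetical, and the
    callback only simulates another thread removing the domain from the list
    during the wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a domain object; not libvirt's virDomainObj. */
struct dom {
    pthread_mutex_t lock;
    int refs;               /* protected by lock */
};

static int freed_count;     /* counts destroyed objects, illustration only */

static struct dom *dom_new(void)
{
    struct dom *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    pthread_mutex_init(&d->lock, NULL);
    d->refs = 1;            /* reference owned by the domain list */
    return d;
}

/* Drops one reference; destroys the object when the last one goes away. */
static void dom_unref(struct dom *d)
{
    int last;

    pthread_mutex_lock(&d->lock);
    last = (--d->refs == 0);
    pthread_mutex_unlock(&d->lock);

    if (last) {
        pthread_mutex_destroy(&d->lock);
        free(d);
        freed_count++;
    }
}

/* Simulates another thread removing the domain from the list. */
static void remove_from_list(struct dom *d)
{
    dom_unref(d);           /* the list gives up its reference */
}

/* Shape of a group 3 call: no job, only a reference and a dropped lock. */
static int dom_snapshot_num(struct dom *d, void (*while_unlocked)(struct dom *))
{
    int result;

    pthread_mutex_lock(&d->lock);
    d->refs++;                        /* extra reference before unlocking */
    pthread_mutex_unlock(&d->lock);   /* lock dropped for the lengthy wait */

    if (while_unlocked)
        while_unlocked(d);            /* e.g. the domain gets removed here */
    result = 3;                       /* pretend async sdk result */

    pthread_mutex_lock(&d->lock);     /* safe: our reference kept it alive */
    assert(d->refs >= 1);
    pthread_mutex_unlock(&d->lock);

    dom_unref(d);                     /* may destroy the object now */
    return result;
}
```

    Without the refs++ the callback would free the object and the second
    pthread_mutex_lock would touch freed memory.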
    
    There is another exception. Calls to make-snapshot/revert-to-snapshot/migrate
    are moved to group 2. That is, they are made mutually exclusive. The reason
    is that the libvirt API supports control/query only for one job per domain and
    these are jobs that are likely to be queried/aborted.
    
    Appendix.
    
    [1] API calls that only query domain cache.
    (marked [*] are included for a different reason)
    
    .domainLookupByID = vzDomainLookupByID,    /* 0.10.0 */
    .domainLookupByUUID = vzDomainLookupByUUID,        /* 0.10.0 */
    .domainLookupByName = vzDomainLookupByName,        /* 0.10.0 */
    .domainGetOSType = vzDomainGetOSType,    /* 0.10.0 */
    .domainGetInfo = vzDomainGetInfo,  /* 0.10.0 */
    .domainGetState = vzDomainGetState,        /* 0.10.0 */
    .domainGetXMLDesc = vzDomainGetXMLDesc,    /* 0.10.0 */
    .domainIsPersistent = vzDomainIsPersistent,        /* 0.10.0 */
    .domainGetAutostart = vzDomainGetAutostart,        /* 0.10.0 */
    .domainGetVcpus = vzDomainGetVcpus, /* 1.2.6 */
    .domainIsActive = vzDomainIsActive, /* 1.2.10 */
    .domainIsUpdated = vzDomainIsUpdated,     /* 1.2.21 */
    .domainGetVcpusFlags = vzDomainGetVcpusFlags, /* 1.2.21 */
    .domainGetMaxVcpus = vzDomainGetMaxVcpus, /* 1.2.21 */
    .domainHasManagedSaveImage = vzDomainHasManagedSaveImage, /* 1.2.13 */
    .domainGetMaxMemory = vzDomainGetMaxMemory, /* 1.2.15 */
    .domainBlockStats = vzDomainBlockStats, /* 1.2.17 */
    .domainBlockStatsFlags = vzDomainBlockStatsFlags, /* 1.2.17 */
    .domainInterfaceStats = vzDomainInterfaceStats, /* 1.2.17 */                   [*]
    .domainMemoryStats = vzDomainMemoryStats, /* 1.2.17 */
    .domainMigrateBegin3Params = vzDomainMigrateBegin3Params, /* 1.3.5 */
    .domainMigrateConfirm3Params = vzDomainMigrateConfirm3Params, /* 1.3.5 */
    
    [2] API calls that use thread shared sdkdom object
    (marked [*] are included for a different reason)
    
    .domainSuspend = vzDomainSuspend,    /* 0.10.0 */
    .domainResume = vzDomainResume,    /* 0.10.0 */
    .domainDestroy = vzDomainDestroy,  /* 0.10.0 */
    .domainShutdown = vzDomainShutdown, /* 0.10.0 */
    .domainCreate = vzDomainCreate,    /* 0.10.0 */
    .domainCreateWithFlags = vzDomainCreateWithFlags, /* 1.2.10 */
    .domainReboot = vzDomainReboot, /* 1.3.0 */
    .domainDefineXML = vzDomainDefineXML,      /* 0.10.0 */
    .domainDefineXMLFlags = vzDomainDefineXMLFlags, /* 1.2.12 */ (update part)
    .domainUndefine = vzDomainUndefine, /* 1.2.10 */
    .domainAttachDevice = vzDomainAttachDevice, /* 1.2.15 */
    .domainAttachDeviceFlags = vzDomainAttachDeviceFlags, /* 1.2.15 */
    .domainDetachDevice = vzDomainDetachDevice, /* 1.2.15 */
    .domainDetachDeviceFlags = vzDomainDetachDeviceFlags, /* 1.2.15 */
    .domainSetUserPassword = vzDomainSetUserPassword, /* 1.3.6 */
    .domainManagedSave = vzDomainManagedSave, /* 1.2.14 */
    .domainSetMemoryFlags = vzDomainSetMemoryFlags, /* 1.3.4 */
    .domainSetMemory = vzDomainSetMemory, /* 1.3.4 */
    .domainRevertToSnapshot = vzDomainRevertToSnapshot, /* 1.3.5 */                  [*]
    .domainSnapshotCreateXML = vzDomainSnapshotCreateXML, /* 1.3.5 */                [*]
    .domainMigratePerform3Params = vzDomainMigratePerform3Params, /* 1.3.5 */        [*]
    .domainUpdateDeviceFlags = vzDomainUpdateDeviceFlags, /* 2.0.0 */
    prlsdkHandleVmConfigEvent
    
    [3] API calls that do not use thread shared sdkdom object
    
    .domainManagedSaveRemove = vzDomainManagedSaveRemove, /* 1.2.14 */
    .domainSnapshotNum = vzDomainSnapshotNum, /* 1.3.5 */
    .domainSnapshotListNames = vzDomainSnapshotListNames, /* 1.3.5 */
    .domainListAllSnapshots = vzDomainListAllSnapshots, /* 1.3.5 */
    .domainSnapshotGetXMLDesc = vzDomainSnapshotGetXMLDesc, /* 1.3.5 */
    .domainSnapshotNumChildren = vzDomainSnapshotNumChildren, /* 1.3.5 */
    .domainSnapshotListChildrenNames = vzDomainSnapshotListChildrenNames, /* 1.3.5 */
    .domainSnapshotListAllChildren = vzDomainSnapshotListAllChildren, /* 1.3.5 */
    .domainSnapshotLookupByName = vzDomainSnapshotLookupByName, /* 1.3.5 */
    .domainHasCurrentSnapshot = vzDomainHasCurrentSnapshot, /* 1.3.5 */
    .domainSnapshotGetParent = vzDomainSnapshotGetParent, /* 1.3.5 */
    .domainSnapshotCurrent = vzDomainSnapshotCurrent, /* 1.3.5 */
    .domainSnapshotIsCurrent = vzDomainSnapshotIsCurrent, /* 1.3.5 */
    .domainSnapshotHasMetadata = vzDomainSnapshotHasMetadata, /* 1.3.5 */
    .domainSnapshotDelete = vzDomainSnapshotDelete, /* 1.3.5 */
    
    [4] Known issues.
    
    1. accidental errors on getting network statistics
    2. race with simultaneous use of thread shared domain object on paths
     of adding domain thru API and adding domain on vz sdk domain added event.