Committed by Nikolay Shirokovskiy
The current vz driver implementation is not usable when it comes to long running operations. Migrating or saving a domain blocks all other operations, even query ones which are expected to remain available. This patch addresses this problem.

All vz driver API calls fall into the following 3 groups:

1. only query domain cache (virDomainObj, vz cache statistic)
   examples are vzDomainGetState, vzDomainGetXMLDesc etc.
2. use thread shared sdkdom object
   examples are vzDomainSetMemoryFlags, vzDomainAttachDevice etc.
3. use neither thread shared sdkdom object nor domain cache
   examples are vzDomainSnapshotListNames, vzDomainSnapshotGetXMLDesc etc.

API calls from group 1 don't need to be changed as they hold the domain lock only for a short period of time. These calls [1] are easily distinguished. They query the domain object through libvirt common code or query the vz sdk statistics handle through vz sdk sync operations.

vzDomainInterfaceStats is the only exception. It uses the sdkdom object to convert an interface name to its vz sdk stack index, which could not be saved in the domain cache. Interface statistics are available with this stack index as the key rather than the name. As a result we can get accidental 'not known interface' errors when querying interface stats. The reason is that in the process of updating domain configuration we drop all devices and then recreate them in the sdkdom object, and the domain lock can be dropped meanwhile (to remove networks for existing bridged interfaces and/or (re)create new ones). We can fix this by changing the way we support bridged interfaces or by reordering operations and changing bridged networks beforehand. Either way, this is better than moving this API call into group 2 and making it an exclusive job.

As to API calls from group 2, the thread shared sdkdom object needs to be explained first. vz sdk has only one handle for a given domain, thus threads need exclusive access to operate on it.
These calls are fixed to drop and reacquire the domain lock around any lengthy operation, namely waiting for the result of an async vz sdk operation. As the lock is dropped we need to take an extra reference to the domain object if it is not taken already, as the domain object can be deleted from the list while the lock is dropped. As these operations use the thread shared sdkdom object, the simplest way to make calls from group 2 consistent with each other is to make them mutually exclusive. This is done by taking/releasing the job condition through the corresponding job routine. This approach makes group 1 and group 2 calls consistent with each other too. Not all calls of group 2 change the domain cache, but those that do update it through prlsdkUpdateDomain, which holds the lock throughout the update.

API calls from group 2 [2] are easily distinguished too. They use beginEdit/commit to change domain configuration (vzDomainSetMemoryFlags) and/or update the domain cache from sdkdom at the end of the operation (vzDomainSuspend).

There is a known issue however. Frankly speaking it was introduced by the earlier patch '[PATCH 6/9] vz: cleanup loading domain code' from a different series. That patch significantly reduced the amount of time the driver lock is held when creating a domain from an API call or as a result of a domain added event from vz sdk. The problem is that these two paths race on using the thread shared sdkdom, as we don't have a libvirt domain object yet and can not lock on it. However this doesn't invalidate that patch, as we can't use the former approach of pre-adding the domain to the list: we need at least the name, and the name is not given by the event. Anyway, I'm against adding a half baked object into the list. Eventually this race can be fixed by extra measures. As to the current situation, races with different configurations are unlikely, and the race between adding a domain through the vz driver and a simultaneous event from vz sdk is not dangerous as the configuration is the same.
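
The group 2 scheme described above (an exclusive job, plus dropping the domain lock and holding an extra reference while waiting for an async result) can be sketched roughly as follows. This is only an illustration built on plain pthreads, not the driver code: demo_dom, demo_job_begin, demo_job_end, demo_wait_async and the other demo_* names are hypothetical stand-ins, and the integer field config merely plays the role of state behind the thread shared sdkdom handle.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain object (not libvirt's virDomainObj). */
typedef struct {
    pthread_mutex_t lock;     /* plays the role of the domain lock */
    pthread_cond_t job_cond;  /* signalled whenever the exclusive job ends */
    bool job_active;          /* true while some thread owns the job */
    int refs;                 /* reference count, protected by lock */
    int config;               /* stands in for shared sdkdom state */
} demo_dom;

static void demo_dom_init(demo_dom *d)
{
    pthread_mutex_init(&d->lock, NULL);
    pthread_cond_init(&d->job_cond, NULL);
    d->job_active = false;
    d->refs = 1;
    d->config = 0;
}

/* Call with d->lock held: sleeps until no other job is running. */
static void demo_job_begin(demo_dom *d)
{
    while (d->job_active)
        pthread_cond_wait(&d->job_cond, &d->lock);
    d->job_active = true;
}

/* Call with d->lock held: releases the job and wakes one waiter. */
static void demo_job_end(demo_dom *d)
{
    d->job_active = false;
    pthread_cond_signal(&d->job_cond);
}

/* Stand-in for waiting on an async operation. The lock is dropped so that
 * query calls can proceed; the extra reference keeps the object alive. */
static void demo_wait_async(demo_dom *d)
{
    d->refs++;
    pthread_mutex_unlock(&d->lock);
    sched_yield();            /* the lengthy wait would happen here */
    pthread_mutex_lock(&d->lock);
    d->refs--;
}

/* Group 2 style call: a read-modify-write that spans a lock drop. */
static void *demo_modify(void *arg)
{
    demo_dom *d = arg;

    pthread_mutex_lock(&d->lock);
    demo_job_begin(d);
    int seen = d->config;
    demo_wait_async(d);
    d->config = seen + 1;     /* safe: no other job could run meanwhile */
    demo_job_end(d);
    pthread_mutex_unlock(&d->lock);
    return NULL;
}

/* Group 1 style call: takes the lock only briefly and needs no job. */
static int demo_query(demo_dom *d)
{
    pthread_mutex_lock(&d->lock);
    int value = d->config;
    pthread_mutex_unlock(&d->lock);
    return value;
}

/* Runs nthreads concurrent modifying calls and returns the final state. */
static int demo_run(int nthreads)
{
    demo_dom d;
    pthread_t threads[16];

    assert(nthreads > 0 && nthreads <= 16);
    demo_dom_init(&d);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, demo_modify, &d);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    int result = demo_query(&d);
    pthread_cond_destroy(&d.job_cond);
    pthread_mutex_destroy(&d.lock);
    return result;
}
```

Calling demo_run(n) from a test program should return n: every modifying call is serialized by the job even though the lock is released in the middle of each one. Without demo_job_begin/demo_job_end the read-modify-write could interleave and lose updates, which is the kind of inconsistency the job condition is meant to rule out.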
The last group [3] is API calls that need only the sdkdom object to make a vz sdk call and don't change the thread shared sdkdom object or the domain cache in any way. For now these are mostly domain snapshot API calls. The changes are similar to those of group 2: they add an extra reference and drop/reacquire the lock while waiting for the vz async call result. One could simply take the immutable sdkdom object from the cache and drop the lock for the rest of the operation, but the chosen approach makes the implementation of these API calls somewhat similar to those from group 2 and thus a bit more future proof. As calls of group 3 don't need the vz driver domain/vz sdk cache in any way, they are consistent with respect to API calls from groups 1 and 3.

There is another exception. Calls to make-snapshot/revert-to-snapshot/migrate are moved to group 2. That is, they are made mutually exclusive. The reason is that the libvirt API supports control/query for only one job per domain, and these are jobs that are likely to be queried/aborted.

Appendix.

[1] API calls that only query domain cache.
(marked [*] are included for a different reason)

    .domainLookupByID = vzDomainLookupByID, /* 0.10.0 */
    .domainLookupByUUID = vzDomainLookupByUUID, /* 0.10.0 */
    .domainLookupByName = vzDomainLookupByName, /* 0.10.0 */
    .domainGetOSType = vzDomainGetOSType, /* 0.10.0 */
    .domainGetInfo = vzDomainGetInfo, /* 0.10.0 */
    .domainGetState = vzDomainGetState, /* 0.10.0 */
    .domainGetXMLDesc = vzDomainGetXMLDesc, /* 0.10.0 */
    .domainIsPersistent = vzDomainIsPersistent, /* 0.10.0 */
    .domainGetAutostart = vzDomainGetAutostart, /* 0.10.0 */
    .domainGetVcpus = vzDomainGetVcpus, /* 1.2.6 */
    .domainIsActive = vzDomainIsActive, /* 1.2.10 */
    .domainIsUpdated = vzDomainIsUpdated, /* 1.2.21 */
    .domainGetVcpusFlags = vzDomainGetVcpusFlags, /* 1.2.21 */
    .domainGetMaxVcpus = vzDomainGetMaxVcpus, /* 1.2.21 */
    .domainHasManagedSaveImage = vzDomainHasManagedSaveImage, /* 1.2.13 */
    .domainGetMaxMemory = vzDomainGetMaxMemory, /* 1.2.15 */
    .domainBlockStats = vzDomainBlockStats, /* 1.2.17 */
    .domainBlockStatsFlags = vzDomainBlockStatsFlags, /* 1.2.17 */
    .domainInterfaceStats = vzDomainInterfaceStats, /* 1.2.17 */ [*]
    .domainMemoryStats = vzDomainMemoryStats, /* 1.2.17 */
    .domainMigrateBegin3Params = vzDomainMigrateBegin3Params, /* 1.3.5 */
    .domainMigrateConfirm3Params = vzDomainMigrateConfirm3Params, /* 1.3.5 */

[2] API calls that use thread shared sdkdom object
(marked [*] are included for a different reason)

    .domainSuspend = vzDomainSuspend, /* 0.10.0 */
    .domainResume = vzDomainResume, /* 0.10.0 */
    .domainDestroy = vzDomainDestroy, /* 0.10.0 */
    .domainShutdown = vzDomainShutdown, /* 0.10.0 */
    .domainCreate = vzDomainCreate, /* 0.10.0 */
    .domainCreateWithFlags = vzDomainCreateWithFlags, /* 1.2.10 */
    .domainReboot = vzDomainReboot, /* 1.3.0 */
    .domainDefineXML = vzDomainDefineXML, /* 0.10.0 */
    .domainDefineXMLFlags = vzDomainDefineXMLFlags, /* 1.2.12 */ (update part)
    .domainUndefine = vzDomainUndefine, /* 1.2.10 */
    .domainAttachDevice = vzDomainAttachDevice, /* 1.2.15 */
    .domainAttachDeviceFlags = vzDomainAttachDeviceFlags, /* 1.2.15 */
    .domainDetachDevice = vzDomainDetachDevice, /* 1.2.15 */
    .domainDetachDeviceFlags = vzDomainDetachDeviceFlags, /* 1.2.15 */
    .domainSetUserPassword = vzDomainSetUserPassword, /* 1.3.6 */
    .domainManagedSave = vzDomainManagedSave, /* 1.2.14 */
    .domainSetMemoryFlags = vzDomainSetMemoryFlags, /* 1.3.4 */
    .domainSetMemory = vzDomainSetMemory, /* 1.3.4 */
    .domainRevertToSnapshot = vzDomainRevertToSnapshot, /* 1.3.5 */ [*]
    .domainSnapshotCreateXML = vzDomainSnapshotCreateXML, /* 1.3.5 */ [*]
    .domainMigratePerform3Params = vzDomainMigratePerform3Params, /* 1.3.5 */ [*]
    .domainUpdateDeviceFlags = vzDomainUpdateDeviceFlags, /* 2.0.0 */
    prlsdkHandleVmConfigEvent

[3] API calls that do not use thread shared sdkdom object

    .domainManagedSaveRemove = vzDomainManagedSaveRemove, /* 1.2.14 */
    .domainSnapshotNum = vzDomainSnapshotNum, /* 1.3.5 */
    .domainSnapshotListNames = vzDomainSnapshotListNames, /* 1.3.5 */
    .domainListAllSnapshots = vzDomainListAllSnapshots, /* 1.3.5 */
    .domainSnapshotGetXMLDesc = vzDomainSnapshotGetXMLDesc, /* 1.3.5 */
    .domainSnapshotNumChildren = vzDomainSnapshotNumChildren, /* 1.3.5 */
    .domainSnapshotListChildrenNames = vzDomainSnapshotListChildrenNames, /* 1.3.5 */
    .domainSnapshotListAllChildren = vzDomainSnapshotListAllChildren, /* 1.3.5 */
    .domainSnapshotLookupByName = vzDomainSnapshotLookupByName, /* 1.3.5 */
    .domainHasCurrentSnapshot = vzDomainHasCurrentSnapshot, /* 1.3.5 */
    .domainSnapshotGetParent = vzDomainSnapshotGetParent, /* 1.3.5 */
    .domainSnapshotCurrent = vzDomainSnapshotCurrent, /* 1.3.5 */
    .domainSnapshotIsCurrent = vzDomainSnapshotIsCurrent, /* 1.3.5 */
    .domainSnapshotHasMetadata = vzDomainSnapshotHasMetadata, /* 1.3.5 */
    .domainSnapshotDelete = vzDomainSnapshotDelete, /* 1.3.5 */

[4] Known issues.

1. accidental errors on getting network statistics
2. race with simultaneous use of thread shared domain object on paths of adding domain through the API and adding domain on vz sdk domain added event.
84299235