qemu_agent: fix deadlock in qemuProcessHandleAgentEOF
If VM A is shutdown a by qemu agent at appoximately the same time an agent EOF of VM A happened, there's a chance that deadlock may occur: qemuProcessHandleAgentEOF in main thread A) priv->agent = NULL; //A happened before B //deadlock when we get agent lock which's held by worker thread qemuAgentClose(agent); qemuDomainObjExitAgent called by qemuDomainShutdownFlags in worker thread B) hasRefs = virObjectUnref(priv->agent); // priv->agent is NULL, // return false if (hasRefs) virObjectUnlock(priv->agent); //agent lock will not be released here In order to resolve, during EOF close the agent first, then set priv->agent to NULL to fix the deadlock. This essentially reverts commit id '1020a504'. It's also of note that commit id '362d0477' notes a possible/rare deadlock similar to what was seen in the monitor in commit id '25f582e3'. However, it seems interceding changes including commit id 'd960d06f' should remove the deadlock issue. With this change, if EOF is called: Get VM lock Check if !priv->agent || priv->beingDestroyed, then unlock VM Call qemuAgentClose Unlock VM When qemuAgentClose is called Get Agent lock If Agent->fd open, close it Unlock Agent Unref Agent qemuDomainObjEnterAgent Enter with VM lock Get Agent lock Increase Agent refcnt Unlock VM After running agent command, calling qemuDomainObjExitAgent Enter with Agent lock Unref Agent If not last reference, unlock Agent Get VM lock If we were in the middle of an EnterAgent, call Agent command, and ExitAgent sequence and the EOF code is triggered, then the EOF code can get the VM lock, make it's checks against !priv->agent || priv->beingDestroyed, and call qemuAgentClose. The CloseAgent would wait to get agent lock. The other thread then will eventually call ExitAgent, release the Agent lock and unref the Agent. Once ExitAgent releases the Agent lock, AgentClose will get the Agent Agent lock, close the fd, unlock the agent, and unref the agent. The final unref would cause deletion of the agent. Signed-off-by: NWang Yufei <james.wangyufei@huawei.com> Reviewed-by: NRen Guannan <renguannan@huawei.com>
Showing
想要评论请 注册 或 登录