    qemu_agent: fix deadlock in qemuProcessHandleAgentEOF · fe51174f
    Committed by Wang Yufei
    If VM A is shut down by the qemu agent at approximately the same time
    as an agent EOF for VM A occurs, a deadlock can result:
    
    qemuProcessHandleAgentEOF in main thread
    A)  priv->agent = NULL; //A happened before B
    
        // deadlocks here: the agent lock is still held by the worker thread
        qemuAgentClose(agent);
    
    qemuDomainObjExitAgent called by qemuDomainShutdownFlags in worker thread
    B)  hasRefs = virObjectUnref(priv->agent); // priv->agent is NULL,
                                               // return false
        if (hasRefs)
            virObjectUnlock(priv->agent); //agent lock will not be released here
    
    To resolve this, the EOF handler now closes the agent first and only then
    sets priv->agent to NULL.
    
    This essentially reverts commit id '1020a504'. Note also that commit
    id '362d0477' mentions a possible/rare deadlock similar to what was seen in
    the monitor in commit id '25f582e3'. However, it seems intervening changes,
    including commit id 'd960d06f', should remove the deadlock issue.
    
    With this change, when the EOF handler is called:

        Get VM lock
        If !priv->agent || priv->beingDestroyed, unlock VM and return
        Call qemuAgentClose, then set priv->agent to NULL
        Unlock VM
    
    When qemuAgentClose is called:
        Get Agent lock
        If Agent->fd open, close it
        Unlock Agent
        Unref Agent
    
    qemuDomainObjEnterAgent:
        Enter with VM lock
        Get Agent lock
        Increase Agent refcnt
        Unlock VM
    
    After running the agent command, qemuDomainObjExitAgent is called:
        Enter with Agent lock
        Unref Agent
        If not last reference, unlock Agent
        Get VM lock
    
    If one thread is in the middle of an EnterAgent, agent command, ExitAgent
    sequence when the EOF code is triggered, the EOF code can get the VM lock,
    make its checks against !priv->agent || priv->beingDestroyed, and call
    qemuAgentClose. qemuAgentClose then waits for the agent lock. The other
    thread eventually calls ExitAgent, which unrefs the agent and releases the
    agent lock. Once ExitAgent releases the agent lock, qemuAgentClose gets
    the agent lock, closes the fd, unlocks the agent, and unrefs the agent.
    That final unref deletes the agent.
    Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
    Reviewed-by: Ren Guannan <renguannan@huawei.com>
qemu_process.c 181.3 KB