    events: Fix domain event race on client disconnect · defa8b85
    Christophe Fergeau committed
    GNOME Boxes sometimes stops getting domain events from libvirtd, even
    after restarting it. Further investigation in libvirtd shows that
    events are properly queued with virDomainEventStateQueue, but the
    timer virDomainEventTimer which flushes the events and sends them to
    the clients never gets called. Looking at the event queue in gdb
    shows that it's non-empty and that its size increases with each new
    event.
    
    virDomainEventTimer is set up in virDomainEventStateRegister[ID]
    when going from 0 clients connected to 1 client connected, but is
    initially disabled. The timer is removed in
    virDomainEventStateDeregister[ID] when the last client is disconnected
    (going from 1 client connected to 0).
    
    This timer (which handles sending the events to the clients) is
    enabled in virDomainEventStateQueue when queueing an event on an
    empty queue (queue containing 0 events). It's disabled in
    virDomainEventStateFlush after flushing the queue (i.e. removing all
    the elements from it). This way, no extra work is done when the queue
    is empty, and when the next event comes up, the timer will get
    reenabled because the queue will go from 0 events to 1 event, which
    triggers enabling the timer.
    
    However, with this Boxes bug, we have a client connected (Boxes), a
    non-empty queue (there are events waiting to be sent), but a disabled
    timer, so something went wrong.
    
    When Boxes connects (it's the only client connecting to the libvirtd
    instance I used for debugging), the event timer is, as expected, not
    yet set (state->timer == -1 when virDomainEventStateRegisterID is called),
    but at the same time the event queue is not empty. In other words,
    we had no clients connected, but pending events. This also explains
    why the timer never gets enabled as this is only done when an event
    is queued on an empty queue.
    
    I think this can happen if an event gets queued using
    virDomainEventStateQueue and the client disconnection happens before
    the event timer virDomainEventTimer gets a chance to run and flush
    the event. In this situation, virDomainEventStateDeregister[ID] will
    get called with a non-empty event queue, and the timer will be destroyed
    if this was the only client connected. Then, when other clients connect
    at a later time, they will never get notified about domain events: the
    timer is only enabled when an event is queued on an empty queue, and
    the queue is already non-empty when virDomainEventStateRegister[ID]
    gets called.
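
    The suspected sequence can be replayed on a small model. This is
    only a sketch under the assumptions stated in this message: struct
    model and the model_* functions are hypothetical names, and
    timer_exists stands in for state->timer != -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the event state; not the libvirt structures. */
struct model {
    int clients;         /* number of registered clients */
    bool timer_exists;   /* stands in for state->timer != -1 */
    bool timer_enabled;  /* whether the flush timer will fire */
    size_t queued;       /* events waiting to be sent */
};

/* First client creates the timer, initially disabled. */
static void model_register(struct model *m)
{
    if (m->clients++ == 0) {
        m->timer_exists = true;
        m->timer_enabled = false;
    }
}

/* Last client leaving destroys the timer but keeps the queue as-is. */
static void model_deregister_buggy(struct model *m)
{
    assert(m->clients > 0);
    if (--m->clients == 0) {
        m->timer_exists = false;
        m->timer_enabled = false;
        /* pending events are left in the queue */
    }
}

/* The timer is armed only when queueing on an empty queue. */
static void model_queue(struct model *m)
{
    if (m->queued == 0 && m->timer_exists)
        m->timer_enabled = true;
    m->queued++;
}

/* The state observed in gdb: a client, pending events, timer off. */
static bool model_is_stuck(const struct model *m)
{
    return m->clients > 0 && m->queued > 0 && !m->timer_enabled;
}
```

    Calling model_register, model_queue, model_deregister_buggy,
    model_register and model_queue in that order (with no flush in
    between) leaves the model in the stuck state: the second client is
    connected, two events are pending, and nothing will ever arm the
    timer again.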
    
    To avoid this issue, this commit makes sure to remove all events from
    the event queue when the last client is unregistered. As there is
    no longer anyone interested in receiving these events, these events
    are stale so there is no need to keep them around. A client connecting
    later will have no interest in getting events that happened before it
    got connected.
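
    The idea of the fix, on a simplified model rather than the actual
    patch: struct fixed_model and the fixed_* functions are hypothetical
    names, and clearing the queue is reduced to resetting a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the event state; not the libvirt structures. */
struct fixed_model {
    int clients;         /* number of registered clients */
    bool timer_exists;   /* stands in for state->timer != -1 */
    bool timer_enabled;  /* whether the flush timer will fire */
    size_t queued;       /* events waiting to be sent */
};

/* First client creates the timer, initially disabled. */
static void fixed_register(struct fixed_model *m)
{
    if (m->clients++ == 0) {
        m->timer_exists = true;
        m->timer_enabled = false;
    }
}

/* The timer is armed only when queueing on an empty queue. */
static void fixed_queue(struct fixed_model *m)
{
    if (m->queued == 0 && m->timer_exists)
        m->timer_enabled = true;
    m->queued++;
}

/* Last client leaving destroys the timer AND drops stale events. */
static void fixed_deregister(struct fixed_model *m)
{
    assert(m->clients > 0);
    if (--m->clients == 0) {
        m->timer_exists = false;
        m->timer_enabled = false;
        m->queued = 0;   /* the fix: nobody is left to receive these */
    }
}
```

    With the queue emptied on the last deregistration, the next event
    queued after a new client registers sees an empty queue and arms the
    timer again.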
domain_event.c 50.7 KB