events: Fix domain event race on client disconnect
GNOME Boxes sometimes stops getting domain events from libvirtd, even after restarting it. Further investigation in libvirtd shows that events are properly queued with virDomainEventStateQueue, but the timer virDomainEventTimer which flushes the events and sends them to the clients never gets called. Looking at the event queue in gdb shows that it's non-empty and that its size increases with each new events. virDomainEventTimer is set up in virDomainEventStateRegister[ID] when going from 0 client connecte to 1 client connected, but is initially disabled. The timer is removed in virDomainEventStateRegister[ID] when the last client is disconnected (going from 1 client connected to 0). This timer (which handles sending the events to the clients) is enabled in virDomainEventStateQueue when queueing an event on an empty queue (queue containing 0 events). It's disabled in virDomainEventStateFlush after flushing the queue (ie removing all the elements from it). This way, no extra work is done when the queue is empty, and when the next event comes up, the timer will get reenabled because the queue will go from 0 event to 1 event, which triggers enabling the timer. However, with this Boxes bug, we have a client connected (Boxes), a non-empty queue (there are events waiting to be sent), but a disabled timer, so something went wrong. When Boxes connects (it's the only client connecting to the libvirtd instance I used for debugging), the event timer is not set as expected (state->timer == -1 when virDomainEventStateRegisterID is called), but at the same time the event queue is not empty. In other words, we had no clients connected, but pending events. This also explains why the timer never gets enabled as this is only done when an event is queued on an empty queue. I think this can happen if an event gets queued using virDomainEventStateQueue and the client disconnection happens before the event timer virDomainEventTimer gets a chance to run and flush the event. In this situation, virDomainEventStateDeregister[ID] will get called with a non-empty event queue, the timer will be destroyed if this was the only client connected. Then, when other clients connect at a later time, they will never get notified about domain events as the event timer will never get enabled because the timer is only enabled if the event queue is empty when virDomainEventStateRegister[ID] gets called, which will is no longer the case. To avoid this issue, this commit makes sure to remove all events from the event queue when the last client in unregistered. As there is no longer anyone interested in receiving these events, these events are stale so there is no need to keep them around. A client connecting later will have no interest in getting events that happened before it got connected.
Showing
想要评论请 注册 或 登录